Vision Res. 2012 Feb 15;55:41-6. doi: 10.1016/j.visres.2011.12.012. Epub 2012 Jan 5.

Predicting the psychophysical similarity of faces and non-face complex shapes by image-based measures

Yue X, Biederman I, Mangini MC, Malsburg Cv, Amir O.

Abstract

Shape representation is accomplished by a series of cortical stages in which cells in the first stage (V1) have local receptive fields tuned to contrast at a particular scale and orientation, each well modeled as a Gabor filter. In succeeding stages, the representation becomes largely invariant to Gabor coding (Kobatake & Tanaka, 1994). Because of the non-Gabor tuning in these later stages, which must be engaged for a behavioral response (Tong, 2003; Tong et al., 1998), a V1-based measure of shape similarity based on Gabor filtering would not be expected to be highly correlated with human performance when discriminating complex shapes (faces and teeth-like blobs) that differ metrically on a two-choice, match-to-sample task. Here we show that human performance is highly correlated with Gabor-based image measures (Gabor simple and complex cells), with values often in the mid 0.90s, even without discounting the variability in the speed and accuracy of performance not associated with the similarity of the distractors. This high correlation is generally maintained through the stages of HMAX, a model that builds upon the Gabor metric and develops units for complex features and larger receptive fields. This is the first report of the psychophysical similarity of complex shapes being predictable from a biologically motivated, physical measure of similarity. As accurate as these measures were in accounting for metric variation, a simple demonstration showed that all were insensitive to viewpoint-invariant (nonaccidental) differences in shape.
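As a rough illustration of what a Gabor-based image measure of similarity can look like, the sketch below filters two grayscale images with a small bank of complex Gabor kernels, takes the response magnitudes (a common stand-in for complex-cell output), and reports the Euclidean distance between the two response vectors. This is not the authors' implementation: the function names, the wavelengths, the number of orientations, the sampling step, and the choice of Euclidean distance are all illustrative assumptions.

```python
import numpy as np

def gabor_kernel(wavelength, theta):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid.
    Envelope width and kernel size are tied to the wavelength (an assumption)."""
    sigma = float(wavelength)
    half = 2 * int(wavelength)
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Coordinate along the direction in which the sinusoid varies
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(1j * 2.0 * np.pi * xr / wavelength)
    kernel = envelope * carrier
    # Subtract the mean so a uniform patch produces a zero response
    return kernel - kernel.mean()

def gabor_magnitudes(image, wavelengths=(4, 8), n_orient=4, step=8):
    """Response magnitudes sampled on a coarse grid, for every scale and orientation."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    feats = []
    for wl in wavelengths:
        for k in range(n_orient):
            kern = gabor_kernel(wl, k * np.pi / n_orient)
            half = kern.shape[0] // 2
            for r in range(half, rows - half, step):
                for c in range(half, cols - half, step):
                    patch = image[r - half:r + half + 1, c - half:c + half + 1]
                    # Magnitude of the complex response (phase is discarded)
                    feats.append(abs(np.sum(patch * kern)))
    return np.array(feats)

def gabor_dissimilarity(img_a, img_b):
    """Euclidean distance between the two magnitude vectors (hypothetical metric choice)."""
    return float(np.linalg.norm(gabor_magnitudes(img_a) - gabor_magnitudes(img_b)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.random((64, 64))       # placeholder images, not real stimuli
    distractor = rng.random((64, 64))
    print(gabor_dissimilarity(sample, distractor))
```

In a study of the kind described, a dissimilarity score like this would be computed for each pair of stimuli and then correlated with error rates or response times on the match-to-sample task; the random arrays here are placeholders only.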

PMID: 22248730