Unsupervised natural experience rapidly alters invariant object representation in visual cortex

Title: Unsupervised natural experience rapidly alters invariant object representation in visual cortex
Publication Type: Conference Proceedings
Year of Publication: 2008
Authors: Li, N, DiCarlo, JJ
Conference Name: Society for Neuroscience
Date Published: 11/2008
Conference Location: Washington, DC, USA

The responses of cortical neurons at the top of the ventral visual stream -- inferior temporal (IT) cortex -- are selective for visual objects, yet tolerant (“invariant”) to changes in object position, size, pose, etc. Though IT responses likely underlie object recognition behavior, how that neuronal tolerance is constructed remains a fundamental mystery. One possibility is that natural visual experience is an implicit teacher: because objects are present for relatively long time intervals, while object motion or viewer motion (e.g. eye movements) causes rapid changes in each object’s retinal image, the visual system could learn tolerance by associating neuronal representations that occur closely in time. If this hypothesis is correct, then we might create “incorrect” tolerance by engineering an altered visual world in which we temporally couple the retinal images of two different objects at different retinal positions. The main prediction is that the visual system would incorrectly associate the representations of those objects at those positions. Thus, IT neurons might lose their position-tolerant selectivity, and instead begin to prefer one object at one position and another object at the other position. To test this idea, two monkeys visually explored our altered visual world and we used real-time eye tracking to present visual objects at controlled retinal positions during free viewing. As the animal saccaded toward a specific object (P), it was consistently replaced by another object (N), rendering the image of P at a peripheral retinal position (“swapped”) temporally contiguous with the image of N on the fovea. We found that exposure to these altered statistics changed IT object selectivity specifically at the swapped position, as predicted.
This unsupervised temporal tolerance learning (UTL) was substantial (~5 spk/s selectivity change in 1 hr), gradually increased with exposure, and was highly significant at the population level (p = 0.007, “position x exposure” interaction, bootstrap). Coupled with the finding that this same experience manipulation changes the position tolerance of human object perception (Cox et al., 2005), we speculate that UTL may reflect the mechanism by which the visual system builds and maintains tolerant object representations. The relatively fast time-scale and unsupervised nature of UTL open the door to advances in systematically characterizing the spatiotemporal image statistics that drive it, understanding whether it plays a role in other types of tolerance (e.g. pose, scale), and perhaps connecting a central cognitive ability -- tolerant object recognition -- to cellular and molecular plasticity mechanisms.