The DiCarlo Lab at MIT

Working to discover the neuronal algorithms underlying visual object recognition


How do you recognize the items on your desk? The faces of your loved ones? The words on this page? Our research goal is to understand the neuronal algorithms and circuits that underlie visual object recognition – an understanding that might help change the world. Concretely, we seek to understand how the visual system transforms each image from an initial, pixel-like representation to a new, remarkably powerful form of representation – one that can support our seemingly effortless ability to solve these object recognition tasks in the real world. We are focused on the crux of the problem, “invariance” – the ability to distinguish among objects despite dramatic image variation [1],[2]. To approach this very difficult problem, the work of our research group is directed along three main lines:

Elucidating Neuronal Object Codes

One key direction is to experimentally measure and analyze the patterns of neuronal spiking activity (“codes”) found at the highest levels of the ventral visual stream (primate inferior temporal cortex, IT). At this high level, those neuronal codes have solved the “invariance” problem [3],[4]. While one should not be surprised that such codes exist in the brain, their discovery and continued deeper understanding enable us to focus on the algorithms that construct the codes.

The Quest for Underlying Algorithms

Discovering the key algorithms requires a tight interplay between experiment and theory. For example, we recently discovered that the key invariance properties of neuronal object codes are plastic and can be built from unsupervised, natural visual experience. To explore the potential power of such ideas, we and our collaborators implement and screen large families of brain-constrained models and test them on real-world problems. More generally, we are building a systematic foundation to bring together neuronal data, mechanistic models, and human recognition performance.

The Circuits that Implement those Algorithms

Clever computational algorithms do not exist in a vacuum, but must be implemented in specific neuronal circuits in brain tissue. We employ high-resolution MRI and fMRI, microfocal stereo x-ray methods, and optogenetic tools to understand the spatial layout of those circuits in the ventral visual cortex. This information will provide clues about the algorithms at work. It will also allow us to interact with those neuronal circuits, both to test hypotheses and potentially to enable new brain-machine interfaces.

Why do we do this research?

Because recognition is critical to so much of behavior, the understanding we seek will fundamentally shape the way we think about how the brain processes sensory information into a format that is highly suited for cognition, decision, and action. Our goals are to use this understanding to inspire and develop powerful artificial vision systems, to aid the development of visual prosthetics, and to provide guidance to molecular approaches to repair lost brain function.

Because the key cortical circuitry is similar in all sensory brain areas, the computational algorithms we aim to discover may facilitate the understanding of how the brain processes other sensory data, such as tactile and auditory information. Similarly, this research has the potential to expose computational strategies that can be abstracted away from the confines of our own sensory apparatus – potentially enabling new forms of intelligence working alongside us.