%0 Journal Article %J bioRxiv %D 2021 %T Multi-scale hierarchical neural network models that bridge from single neurons in the primate primary visual cortex to object recognition behavior %A Tiago Marques %A Martin Schrimpf %A James J DiCarlo %X Object recognition relies on inferior temporal (IT) cortical neural population representations that are themselves computed by a hierarchical network of feedforward and recurrently connected neural populations called the ventral visual stream (areas V1, V2, V4 and IT). While recent work has created some reasonably accurate image-computable hierarchical neural network models of those neural stages, those models do not yet bridge between the properties of individual neurons and the overall emergent behavior of the ventral stream. For example, current leading ventral stream models do not allow us to ask questions such as: How does the surround suppression behavior of individual V1 neurons ultimately relate to IT neural representation and to behavior? Or: How would deactivation of a particular sub-population of V1 neurons specifically alter object recognition behavior? One reason we cannot yet do this is that individual V1 artificial neurons in multi-stage models have not been shown to be functionally similar to individual biological V1 neurons. Here, we took an important first step in this direction by building and evaluating hundreds of hierarchical neural network models on how well their artificial single neurons approximate macaque primary visual cortical (V1) neurons. We found that single neurons in some models are surprisingly similar to their biological counterparts and that the distributions of single-neuron properties, such as those related to orientation and spatial frequency tuning, approximately match those in macaque V1. Crucially, we also observed that hierarchical models with V1 layers that better match macaque V1 at the single-neuron level are also more aligned with human object recognition behavior.
These results provide the first multi-stage, multi-scale models that allow our field to ask precisely how the specific properties of individual V1 neurons relate to recognition behavior. Finally, we here show that an optimized classical neuroscientific model of V1 is still more functionally similar to primate V1 than all of the tested multi-stage models, suggesting that further model improvements are possible, and that those improvements would likely have tangible payoffs in terms of behavioral prediction accuracy and behavioral robustness.
Single neurons in some image-computable hierarchical neural network models are functionally similar to single neurons in macaque primary visual cortex (V1). Some hierarchical neural network models have V1 layers that better match the biological distributions of macaque V1 single-neuron response properties. Multi-stage hierarchical neural network models with V1 stages that better match macaque V1 are also more aligned with human object recognition behavior at their output stage.
Competing Interest Statement: The authors have declared no competing interest. %B bioRxiv %8 03/01/2021 %G eng %9 preprint %R 10.1101/2021.03.01.433495 %0 Conference Paper %B Neural Information Processing Systems %D 2019 %T Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs %A Jonas Kubilius %A Martin Schrimpf %A Ha Hong %A Najib Majaj %A Rajalingham, Rishi %A Issa, Elias B. %A Kohitij Kar %A Bashivan, Pouya %A Jonathan Prescott-Roy %A Kailyn Schmidt %A Aran Nayebi %A Daniel Bear %A Daniel L. K. Yamins %A James J. DiCarlo %X

Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categorization performance, yet bringing into question how brain-like they still are. In particular, typical deep models from the machine learning community are often hard to map onto the brain's anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. Here we demonstrate that better anatomical alignment to the brain and high performance on machine learning as well as neuroscience measures do not have to be in contradiction. We developed CORnet-S, a shallow ANN with four anatomically mapped areas and recurrent connectivity, guided by Brain-Score, a new large-scale composite of neural and behavioral benchmarks for quantifying the functional fidelity of models of the primate ventral visual stream. Despite being significantly shallower than most models, CORnet-S is the top model on Brain-Score and outperforms similarly compact models on ImageNet. Moreover, our extensive analyses of CORnet-S circuitry variants reveal that recurrence is the main predictive factor of both Brain-Score and ImageNet top-1 performance. Finally, we report that the temporal evolution of the CORnet-S "IT" neural population resembles the actual monkey IT population dynamics. Taken together, these results establish CORnet-S, a compact, recurrent ANN, as the current best model of the primate ventral visual stream.

%B Neural Information Processing Systems %G eng %U https://papers.nips.cc/paper/9441-brain-like-object-recognition-with-high-performing-shallow-recurrent-anns.pdf %0 Journal Article %J bioRxiv %D 2019 %T To find better neural network models of human vision, find better neural network models of primate vision %A Kamila M. Jozwik %A Martin Schrimpf %A Nancy Kanwisher %A James J. DiCarlo %X

Specific deep artificial neural networks (ANNs) are the current best models of ventral visual processing and object recognition behavior in monkeys. Here we explore whether models of non-human primate vision generalize to visual processing in the human brain. Specifically, we asked whether a model's match to monkey IT predicts its match to human IT, even when scoring those matches on different images. We found that the model match to monkey IT is a positive predictor of the model match to human IT (R = 0.36), and that this approach outperforms the current standard predictor, model accuracy on ImageNet. This suggests a more powerful approach for pre-selecting models as hypotheses of human brain processing.

%B bioRxiv %8 07/2019 %G eng %U https://www.biorxiv.org/content/10.1101/688390v1.full.pdf %9 preprint %R 10.1101/688390 %0 Journal Article %J bioRxiv %D 2018 %T Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? %A Martin Schrimpf %A Kubilius, Jonas %A Ha Hong %A Najib Majaj %A Rajalingham, Rishi %A Issa, Elias B. %A Kar, Kohitij %A Bashivan, Pouya %A Jonathan Prescott-Roy %A Schmidt, Kailyn %A Daniel L. K. Yamins %A DiCarlo, James J. %X

The internal representations of early deep artificial neural networks (ANNs) were found to be remarkably similar to the internal neural representations measured experimentally in the primate brain. Here we ask, as deep ANNs have continued to evolve, are they becoming more or less brain-like? ANNs that are most functionally similar to the brain will contain mechanisms that are most like those used by the brain. We therefore developed Brain-Score - a composite of multiple neural and behavioral benchmarks that score any ANN on how similar it is to the brain's mechanisms for core object recognition - and we deployed it to evaluate a wide range of state-of-the-art deep ANNs. Using this scoring system, we here report that: (1) DenseNet-169, CORnet-S and ResNet-101 are the most brain-like ANNs. (2) There remains considerable variability in neural and behavioral responses that is not predicted by any ANN, suggesting that no ANN model has yet captured all the relevant mechanisms. (3) Extending prior work, we found that gains in ANN ImageNet performance led to gains on Brain-Score. However, the correlation weakened at >= 70% top-1 ImageNet performance, suggesting that additional guidance from neuroscience is needed to make further advances in capturing brain mechanisms. (4) We uncovered smaller (i.e., less complex) ANNs that are more brain-like than many of the best-performing ImageNet models, which suggests the opportunity to simplify ANNs to better understand the ventral stream. The scoring system used here is far from complete. However, we propose that evaluating and tracking model-benchmark correspondences through a Brain-Score that is regularly updated with new brain data is an exciting opportunity: experimental benchmarks can be used to guide machine network evolution, and machine networks are mechanistic hypotheses of the brain's network that can drive the next experiments.
To facilitate both of these, we release Brain-Score.org: a platform that hosts the neural and behavioral benchmarks, where ANNs for visual processing can be submitted to receive a Brain-Score and their rank relative to other models, and where new experimental data can be naturally incorporated.

%B bioRxiv %8 09/2018 %G eng %U https://www.biorxiv.org/content/10.1101/407007v2.full.pdf %9 preprint %R 10.1101/407007 %0 Journal Article %J bioRxiv %D 2018 %T CORnet: Modeling the Neural Mechanisms of Core Object Recognition %A Kubilius, Jonas %A Martin Schrimpf %A Aran Nayebi %A Daniel Bear %A Daniel L. K. Yamins %A DiCarlo, James J. %X

Deep artificial neural networks with spatially repeated processing (a.k.a. deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance and increasingly better explanatory power of both neural and behavioral responses. However, from the neuroscientist's point of view, the relationship between such very deep architectures and the ventral visual pathway is incomplete in at least two ways. On the one hand, current state-of-the-art ANNs appear to be too complex (e.g., now over 100 levels) compared with the relatively shallow cortical hierarchy (4-8 levels), which makes it difficult to map their elements to those in the ventral visual stream and to understand what they are doing. On the other hand, current state-of-the-art ANNs appear to be not complex enough in that they lack recurrent connections and the resulting neural response dynamics that are commonplace in the ventral visual stream. Here we describe our ongoing efforts to resolve both of these issues by developing a "CORnet" family of deep neural network architectures. Rather than just seeking high object recognition performance (as the state-of-the-art ANNs above do), we instead try to reduce the model family to its most important elements and then gradually build new ANNs with recurrent and skip connections while monitoring both performance and the match between each new CORnet model and a large body of primate brain and behavioral data.
We report here that our current best ANN model derived from this approach (CORnet-S) is among the top models on Brain-Score, a composite benchmark for comparing models to the brain, but is simpler than other deep ANNs in terms of the number of convolutions performed along the longest path of information processing in the model. All CORnet models are available at https://github.com/dicarlolab/CORnet, and we plan to update this manuscript and the available models in this family as they are produced.

%B bioRxiv %8 09/2018 %G eng %U https://www.biorxiv.org/content/10.1101/408385v1.full.pdf %9 preprint %R 10.1101/408385