MIT neuroscientists see design flaws in computer vision tests

January 24, 2008

For years, scientists have been trying to teach computers how to see like humans, and recent research has seemed to show computers making progress in recognizing visual objects. A new MIT study, however, cautions that this apparent success may be misleading because the tests being used are inadvertently stacked in favor of computers.

Computer vision is important for applications ranging from "intelligent" cars to visual prosthetics for the blind. Recent computational models have shown apparently impressive progress, boasting 60-percent success rates in classifying natural photographic image sets. Those sets include the widely used Caltech101 database, which is intended to test computer vision algorithms against the variety of images seen in the real world.

However, James DiCarlo, a neuroscientist in the McGovern Institute for Brain Research at MIT, graduate student Nicolas Pinto and David Cox of the Rowland Institute at Harvard argue that these image sets have design flaws that let computers succeed where they would fail on more authentically varied images. For example, photographers tend to center objects in a frame and to prefer certain views and contexts. The human visual system, by contrast, encounters objects across a much broader range of conditions.
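
The point about photographic bias lends itself to a small illustration. The sketch below, written in Python with the Pillow imaging library, composites an object onto a background at a random position, scale, and in-plane rotation, producing test images with the kind of variation that photographer-framed collections underrepresent. The file paths and parameter ranges here are illustrative assumptions, not anything taken from the study itself.

    import random
    from PIL import Image

    def make_varied_test_image(obj_path, bg_path, out_size=(256, 256)):
        """Composite an object onto a background at a random position,
        scale, and in-plane rotation, mimicking the variation real
        vision faces but photographer-curated image sets often lack.

        obj_path and bg_path are hypothetical file paths; the object
        image is assumed to carry an alpha channel marking its outline.
        """
        obj = Image.open(obj_path).convert("RGBA")
        bg = Image.open(bg_path).convert("RGBA").resize(out_size)

        # Random scale: the object may appear large or small in the frame.
        scale = random.uniform(0.3, 1.0)
        obj = obj.resize((max(int(obj.width * scale), 1),
                          max(int(obj.height * scale), 1)))

        # Random in-plane rotation; expand=True keeps the whole object visible.
        obj = obj.rotate(random.uniform(-45, 45), expand=True)

        # Random position: real objects need not sit at the center of view.
        x = random.randint(0, max(out_size[0] - obj.width, 0))
        y = random.randint(0, max(out_size[1] - obj.height, 0))

        # Use the object's alpha channel as the paste mask.
        bg.paste(obj, (x, y), obj)
        return bg.convert("RGB")

A test set built this way forces a recognition algorithm to cope with where, how big, and at what angle an object appears, rather than rewarding it for memorizing the centered, conventionally framed views that dominate ordinary photographs.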