Multimodal Neurons in Artificial Neural Networks
March 4, 2021
22 minute read
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy in classifying surprising visual renditions of concepts, and is also an important step toward understanding the associations and biases that CLIP and similar models learn.
Fifteen years ago, Quiroga et al. discovered that the human brain possesses multimodal neurons. These neurons respond to clusters of abstract concepts centered around a common high-level theme, rather than to any specific visual feature. The most famous of these was the “Halle Berry” neuron, featured in both Scientific American and The New York Times, which responds to photographs, sketches, and the text “Halle Berry” (but not other names).