Researchers developed Prov-GigaPath, a whole-slide pathology foundation model built on a novel vision transformer architecture. The model demonstrates superior performance in mutation prediction, cancer subtyping, and vision-language tasks, and was trained on large-scale real-world data from more than 30,000 patients to enhance clinical diagnostics.
Imagine if an artificial intelligence (AI) model could learn language just like a child does by seeing and hearing the world through their eyes and ears.
Researchers at NYU trained a model called CVCL (Child's View for Contrastive Learning) to link words with visual cues. The model was trained on about 60 hours of footage recorded by a head-mounted camera worn by a single toddler.
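Linking words to visual cues in this way typically relies on a contrastive objective: embeddings of an image frame and the word spoken alongside it are pulled together, while mismatched pairs in a batch are pushed apart. The sketch below illustrates that idea with an InfoNCE-style loss over toy embeddings; all names, dimensions, and parameters are illustrative assumptions, not details from the NYU study.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    # Project embeddings onto the unit sphere so dot products are cosines
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def contrastive_loss(img_emb, word_emb, temperature=0.07):
    """InfoNCE-style loss: diagonal entries are the matched (image, word) pairs."""
    img = l2_normalize(img_emb)
    txt = l2_normalize(word_emb)
    logits = img @ txt.T / temperature   # pairwise similarity matrix
    labels = np.arange(len(img))         # i-th image matches i-th word

    def xent(lg):
        # Cross-entropy of each row against its diagonal (matched) entry
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Symmetrize: image-to-word and word-to-image directions
    return (xent(logits) + xent(logits.T)) / 2

batch = 8
img_emb = rng.normal(size=(batch, 32))
# Word embeddings that nearly match their paired images (small noise)
word_emb = img_emb + 0.1 * rng.normal(size=(batch, 32))
loss_matched = contrastive_loss(img_emb, word_emb)
# Unrelated word embeddings should give a higher loss
loss_random = contrastive_loss(img_emb, rng.normal(size=(batch, 32)))
print(loss_matched < loss_random)
```

Training drives the model toward embeddings like the "matched" case, so that at test time a word's embedding is closest to images of the thing it names.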