Deep learning: the good, the bad, and the ugly

T Serre - Annual review of vision science, 2019 - annualreviews.org
Artificial vision has often been described as one of the key remaining challenges to be
solved before machines can act intelligently. Recent developments in a branch of machine …

[BOOK][B] The neural bases of multisensory processes

MM Murray, MT Wallace - 2011 - taylorfrancis.com
It has become accepted in the neuroscience community that perception and performance
are quintessentially multisensory by nature. Using the full palette of modern brain imaging …

MATNet: Motion-attentive transition network for zero-shot video object segmentation

T Zhou, J Li, S Wang, R Tao… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
In this paper, we present a novel end-to-end learning neural network, i.e., MATNet, for
zero-shot video object segmentation (ZVOS). Motivated by the human visual attention behavior …

Contrastive learning explains the emergence and function of visual category-selective regions

JS Prince, GA Alvarez, T Konkle - Science Advances, 2024 - science.org
Modular and distributed coding theories of category selectivity along the human ventral
visual stream have long existed in tension. Here, we present a reconciling framework …

Learning features by watching objects move

D Pathak, R Girshick, P Dollár… - Proceedings of the …, 2017 - openaccess.thecvf.com
This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired
by the human visual system, we explore whether low-level motion-based grouping cues can …

Video (language) modeling: a baseline for generative models of natural videos

MA Ranzato, A Szlam, J Bruna, M Mathieu… - arXiv preprint arXiv …, 2014 - arxiv.org
We propose a strong baseline model for unsupervised feature learning using video data. By
learning to predict missing frames or extrapolate future frames from an input video …

Segmentation of moving objects by long term video analysis

P Ochs, J Malik, T Brox - IEEE transactions on pattern analysis …, 2013 - ieeexplore.ieee.org
Motion is a strong cue for unsupervised object-level grouping. In this paper, we demonstrate
that motion will be exploited most effectively, if it is regarded over larger time windows …

[HTML][HTML] Development of human visual function

O Braddick, J Atkinson - Vision research, 2011 - Elsevier
By 1985 newly devised behavioural and electrophysiological techniques had been used to
track development of infants' acuity, contrast sensitivity and binocularity, and for clinical …

Reading with sounds: sensory substitution selectively activates the visual word form area in the blind

E Striem-Amit, L Cohen, S Dehaene, A Amedi - Neuron, 2012 - cell.com
Using a visual-to-auditory sensory-substitution algorithm, congenitally fully blind adults were
taught to read and recognize complex images using "soundscapes"—sounds …

Learning what and where to attend

D Linsley, D Shiebler, S Eberhardt, T Serre - arXiv preprint arXiv …, 2018 - arxiv.org
Most recent gains in visual recognition have originated from the inclusion of attention
mechanisms in deep convolutional networks (DCNs). Because these networks are …