Capturing the objects of vision with neural networks

B Peters, N Kriegeskorte - Nature Human Behaviour, 2021 - nature.com
Human visual perception carves a scene at its physical joints, decomposing the world into
objects, which are selectively attended, tracked and predicted as we engage our …

A survey on long-tailed visual recognition

L Yang, H Jiang, Q Song, J Guo - International Journal of Computer Vision, 2022 - Springer
The heavy reliance on data is one of the major reasons that currently limit the development
of deep learning. Data quality directly dominates the effect of deep learning models, and the …

On the binding problem in artificial neural networks

K Greff, S Van Steenkiste, J Schmidhuber - arXiv preprint arXiv …, 2020 - arxiv.org
Contemporary neural networks still fall short of human-level generalization, which extends
far beyond our direct experiences. In this paper, we argue that the underlying cause for this …

Remix: rebalanced mixup

HP Chou, SC Chang, JY Pan, W Wei… - Computer Vision–ECCV …, 2020 - Springer
Deep image classifiers often perform poorly when training data are heavily class-imbalanced.
In this work, we propose a new regularization technique, Remix, that relaxes …

An investigation into pre-training object-centric representations for reinforcement learning

J Yoon, YF Wu, H Bae, S Ahn - arXiv preprint arXiv:2302.04419, 2023 - arxiv.org
Unsupervised object-centric representation (OCR) learning has recently drawn attention as
a new paradigm of visual representation. This is because of its potential of being an effective …

Generative neurosymbolic machines

J Jiang, S Ahn - Advances in Neural Information Processing …, 2020 - proceedings.neurips.cc
Reconciling symbolic and distributed representations is a crucial challenge that can
potentially resolve the limitations of current deep learning. Remarkable advances in this …

Enhancing the transformer with explicit relational encoding for math problem solving

I Schlag, P Smolensky, R Fernandez, N Jojic… - arXiv preprint arXiv …, 2019 - arxiv.org
We incorporate Tensor-Product Representations within the Transformer in order to better
support the explicit representation of relation structure. Our Tensor-Product Transformer (TP …

ROOTS: Object-centric representation and rendering of 3D scenes

C Chen, F Deng, S Ahn - Journal of Machine Learning Research, 2021 - jmlr.org
A crucial ability of human intelligence is to build up models of individual 3D objects from
partial scene observations. Recent works either achieve object-centric generation but …

Unsupervised object keypoint learning using local spatial predictability

A Gopalakrishnan, S van Steenkiste… - arXiv preprint arXiv …, 2020 - arxiv.org
We propose PermaKey, a novel approach to representation learning based on object
keypoints. It leverages the predictability of local image regions from spatial neighborhoods …

R-SQAIR: Relational sequential attend, infer, repeat

A Stanić, J Schmidhuber - arXiv preprint arXiv:1910.05231, 2019 - arxiv.org
Traditional sequential multi-object attention models rely on a recurrent mechanism to infer
object relations. We propose a relational extension (R-SQAIR) of one such attention model …