Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

The neuroconnectionist research programme

A Doerig, RP Sommers, K Seeliger… - Nature Reviews …, 2023 - nature.com
Artificial neural networks (ANNs) inspired by biology are beginning to be widely used to
model behavioural and neural data, an approach we call 'neuroconnectionism'. ANNs have …

Masked autoencoders as spatiotemporal learners

C Feichtenhofer, Y Li, K He - Advances in neural …, 2022 - proceedings.neurips.cc
This paper studies a conceptually simple extension of Masked Autoencoders (MAE) to
spatiotemporal representation learning from videos. We randomly mask out spacetime …

Toward causal representation learning

B Schölkopf, F Locatello, S Bauer, NR Ke… - Proceedings of the …, 2021 - ieeexplore.ieee.org
The two fields of machine learning and graphical causality arose and are developed
separately. However, there is, now, cross-pollination and increasing interest in both fields to …

Contrastive and non-contrastive self-supervised learning recover global and local spectral embedding methods

R Balestriero, Y LeCun - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Self-Supervised Learning (SSL) surmises that inputs and pairwise positive
relationships are enough to learn meaningful representations. Although SSL has recently …

Bootstrap your own latent - a new approach to self-supervised learning

JB Grill, F Strub, F Altché, C Tallec… - Advances in neural …, 2020 - proceedings.neurips.cc
We introduce Bootstrap Your Own Latent (BYOL), a new approach to self-supervised
image representation learning. BYOL relies on two neural networks, referred to …

A large-scale study on unsupervised spatiotemporal representation learning

C Feichtenhofer, H Fan, B Xiong… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a large-scale study on unsupervised spatiotemporal representation learning
from videos. With a unified perspective on four recent image-based frameworks, we study a …

Spatio-temporal self-supervised representation learning for 3d point clouds

S Huang, Y Xie, SC Zhu, Y Zhu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
To date, various 3D scene understanding tasks still lack practical and generalizable
pre-trained models, primarily due to the intricate nature of 3D scene understanding tasks and …

Data-efficient image recognition with contrastive predictive coding

O Henaff - International conference on machine learning, 2020 - proceedings.mlr.press
Human observers can learn to recognize new categories of images from a handful of
examples, yet doing so with artificial ones remains an open challenge. We hypothesize that …

Learning deep representations by mutual information estimation and maximization

RD Hjelm, A Fedorov, S Lavoie-Marchildon… - arXiv preprint arXiv …, 2018 - arxiv.org
In this work, we perform unsupervised learning of representations by maximizing mutual
information between an input and the output of a deep neural network encoder. Importantly …