Survey of protein sequence embedding models

C Tran, S Khadkikar, A Porollo - International Journal of Molecular …, 2023 - mdpi.com
Derived from the natural language processing (NLP) algorithms, protein language models
enable the encoding of protein sequences, which are widely diverse in length and amino …

In the twilight zone of protein sequence homology: do protein language models learn protein structure?

A Kabir, A Moldwin, Y Bromberg… - Bioinformatics …, 2024 - academic.oup.com
Motivation: Protein language models based on the transformer architecture are increasingly
improving performance on protein prediction tasks, including secondary structure …

Toward universal cell embeddings: integrating single-cell RNA-seq datasets across species with SATURN

Y Rosen, M Brbić, Y Roohani, K Swanson, Z Li… - Nature …, 2024 - nature.com
Abstract: Analysis of single-cell datasets generated from diverse organisms offers
unprecedented opportunities to unravel fundamental evolutionary processes of conservation …

Tripartite interaction of a patescibacterial epibiont, a methylotrophic gammaproteobacterial host and a jumbo phage

F Bouderka, P López-García, P Deschamps… - bioRxiv, 2024 - biorxiv.org
Patescibacteria form a very diverse and widely distributed phylum of small bacteria inferred
to have an episymbiotic lifestyle. However, the prevalence of this lifestyle within the phylum …