Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually …
Language models have recently emerged as a powerful machine-learning approach for distilling information from massive protein sequence databases. From readily available …
In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and …
Microbes drive most ecosystems and are modulated by viruses that impact their lifespan, gene flow, and metabolic outputs. However, ecosystem-level impacts of viral community …
M Mascher, H Gundlach, A Himmelbach, S Beier… - Nature, 2017 - nature.com
Cereal grasses of the Triticeae tribe have been the major food source in temperate regions since the dawn of agriculture. Their large genomes are characterized by a high content of …
Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and material science. We pose protein engineering as an …
Anaerobic oxidation of methane (AOM) is a major biological process that reduces global methane emission to the atmosphere. Anaerobic methanotrophic archaea (ANME) mediate …
Phylogenetic tree confidence is often estimated from a multiple sequence alignment (MSA) using the Felsenstein bootstrap heuristic. However, this does not account for systematic …
Background: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective …