Information theory applications for biological sequence analysis

S Vinga - Briefings in bioinformatics, 2014 - academic.oup.com
Information theory (IT) addresses the analysis of communication systems and has
been widely applied in molecular biology. In particular, alignment-free sequence analysis …

Solving the protein sequence metric problem

WR Atchley, J Zhao, AD Fernandes… - Proceedings of the …, 2005 - National Acad Sciences
Biological sequences are composed of long strings of alphabetic letters rather than arrays of
numerical values. Lack of a natural underlying metric for comparing such alphabetic data …

Self-supervised contrastive learning of protein representations by mutual information maximization

AX Lu, H Zhang, M Ghassemi, A Moses - bioRxiv, 2020 - biorxiv.org
Pretrained embedding representations of biological sequences which capture meaningful
properties can alleviate many problems associated with supervised learning in biology. We …

Molecular evolution of the GATA family of transcription factors: conservation within the DNA-binding domain

JA Lowry, WR Atchley - Journal of molecular evolution, 2000 - Springer
The GATA-binding transcription factors comprise a protein family whose members contain
either one or two highly conserved zinc finger DNA-binding domains. Members of this group …

Positional dependence, cliques, and predictive motifs in the bHLH protein domain

WR Atchley, W Terhalle, A Dress - Journal of molecular evolution, 1999 - Springer
Quantitative analyses were carried out on a large number of proteins that contain the highly
conserved basic helix–loop–helix domain. Measures derived from information theory were …

Correlations among amino acid sites in bHLH protein domains: an information theoretic analysis

WR Atchley, KR Wollenberg, WM Fitch… - Molecular biology …, 2000 - academic.oup.com
An information theoretic approach is used to examine the magnitude and origin of
associations among amino acid sites in the basic helix-loop-helix (bHLH) family of …

Information Theory, Living Systems, and Communication Engineering

D Bajić - Entropy, 2024 - mdpi.com
Mainstream research on information theory within the field of living systems involves the
application of analytical tools to understand a broad range of life processes. This paper is …

Survey on encoding schemes for genomic data representation and feature learning—from signal processing to machine learning

N Yu, Z Li, Z Yu - Big Data Mining and Analytics, 2018 - ieeexplore.ieee.org
Data-driven machine learning, especially deep learning technology, is becoming an
important tool for handling big data issues in bioinformatics. In machine learning, DNA …

Biochemical patterns of antibody polyreactivity revealed through a bioinformatics-based analysis of CDR loops

CT Boughter, MT Borowska, JJ Guthmiller, A Bendelac… - eLife, 2020 - elifesciences.org
Antibodies are critical components of adaptive immunity, binding with high affinity to
pathogenic epitopes. Antibodies undergo rigorous selection to achieve this high affinity, yet …

Defining diversity, specialization, and gene specificity in transcriptomes through information theory

O Martínez, MH Reyes-Valdés - Proceedings of the …, 2008 - National Acad Sciences
The transcriptome is a set of genes transcribed in a given tissue under specific conditions
and can be characterized by a list of genes with their corresponding frequencies of …