Gapped BLAST and PSI-BLAST: a new generation of protein database search programs

SF Altschul, TL Madden, AA Schäffer… - Nucleic acids …, 1997 - academic.oup.com
The BLAST programs are widely used tools for searching protein and DNA databases for
sequence similarities. For protein comparisons, a variety of definitional, algorithmic and …

Sequence analysis: New methods for old ideas

A Abbott - Annual review of sociology, 1995 - annualreviews.org
A wide variety of work in social science concerns sequences of events or phenomena. This
essay reviews concepts of sequence and methods for analyzing sequences. After a brief …

Intron retention is a widespread mechanism of tumor-suppressor inactivation

H Jung, D Lee, J Lee, D Park, YJ Kim, WY Park… - Nature …, 2015 - nature.com
A substantial fraction of disease-causing mutations are pathogenic through aberrant
splicing. Although genome profiling studies have identified somatic single-nucleotide …

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix …

JD Thompson, DG Higgins, TJ Gibson - Nucleic acids research, 1994 - academic.oup.com
The sensitivity of the commonly used progressive multiple sequence alignment method has
been greatly improved for the alignment of divergent protein sequences. Firstly, individual …

MUSCLE: a multiple sequence alignment method with reduced time and space complexity

RC Edgar - BMC bioinformatics, 2004 - Springer
Background In a previous paper, we introduced MUSCLE, a new program for creating
multiple alignments of protein sequences, giving a brief summary of the algorithm and …

T-Coffee: A novel method for fast and accurate multiple sequence alignment

C Notredame, DG Higgins, J Heringa - Journal of molecular biology, 2000 - Elsevier
We describe a new method (T-Coffee) for multiple sequence alignment that provides a
dramatic improvement in accuracy with a modest sacrifice in speed as compared to the most …

[PDF][PDF] Fitting a mixture model by expectation maximization to discover motifs in biopolymers

TL Bailey, C Elkan - 1994 - cs.toronto.edu
The algorithm described in this paper discovers one or more motifs in a collection of DNA or
protein sequences by using the technique of expectation maximization to fit a two …

PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences

M Lescot, P Déhais, G Thijs, K Marchal… - Nucleic acids …, 2002 - academic.oup.com
PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors.
Regulatory elements are represented by positional matrices, consensus sequences and …

Springer series in statistics

P Bickel, P Diggle, S Fienberg, U Gather, I Olkin… - Principles and Theory …, 2009 - Springer
The idea for this book came from the time the authors spent at the Statistics and Applied
Mathematical Sciences Institute (SAMSI) in Research Triangle Park in North Carolina …

[BOOK][B] Biological sequence analysis: probabilistic models of proteins and nucleic acids

R Durbin, SR Eddy, A Krogh, G Mitchison - 1998 - books.google.com
Probabilistic models are becoming increasingly important in analysing the huge amount of
data being produced by large-scale DNA-sequencing efforts such as the Human Genome …