Motivation
The Protein Data Bank currently contains more than 4700 protein coordinate sets. It is often desirable to select a subset of these files based on a criterion such as R-factor, experimental method, length of the amino acid sequence, or the number of homologous sequences in swissprot. Doing this with the distributed form of the Protein Data Bank can be tedious, because (1) it requires reading one file for every single entry, and (2) not all of the information is present in a consistent, computer-readable form in all of the entries.
Results
The pdbfinder database provides an easy-to-interpret file containing summary information about all Protein Data Bank files. Summary information from the dssp (Definition of Secondary Structure of Proteins) and hssp (Homology derived Secondary Structure of Proteins) databases is also included. Furthermore, where essential data were missing from a Protein Data Bank file, this information has been retrieved from the original literature.
Availability
The latest version of the pdbfinder database can be downloaded by anonymous ftp from swift.embl-heidelberg.de, directory: /pdbfinder.