This paper introduced the first machine-learned "encyclopedia" of the common protein-fold structures. The Journal of Molecular Biology is the premier journal in this area.
The study of protein structure has largely been driven by the careful inspection of experimental data by human experts. However, the rapid production of protein structures from structural-genomics projects will make it increasingly difficult to analyse (and determine the principles responsible for) the distribution of proteins in fold space by inspection alone. Here, we demonstrate a machine-learning strategy that automatically determines the structural principles describing 45 classes of fold. The rules learnt were shown to be both statistically significant and meaningful to protein experts. With the increasing emphasis on high-throughput experimental initiatives, machine-learning and other automated methods of analysis will become increasingly important for many biological problems.