TY - JOUR
T1 - Prediction of the conformation and geometry of loops in globular proteins
T2 - testing ArchDB, a structural classification of loops
AU - Fernandez-Fuentes, Narcis
AU - Querol, Enrique
AU - Aviles, Francesc X.
AU - Sternberg, Michael J. E.
AU - Oliva, Baldomero
N1 - Copyright 2005 Wiley-Liss, Inc.
PY - 2005/9/1
Y1 - 2005/9/1
N2 - In protein structure prediction, a central problem is defining the structure of a loop connecting 2 secondary structures. This problem frequently occurs in homology modeling, fold recognition, and in several strategies in ab initio structure prediction. In our previous work, we developed a classification database of structural motifs, ArchDB. The database contains 12,665 clustered loops in 451 structural classes with information about phi-psi angles in the loops and 1492 structural subclasses with the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used, a HMM profile and a PSSM derived from PSI-BLAST. A jack-knife test was made removing homologous loops using SCOP superfamily definition and predicting afterwards against recalculated profiles that only take into account the sequence information. Two scenarios were considered: (1) prediction of structural class with application in comparative modeling and (2) prediction of structural subclass with application in fold recognition and ab initio. For the first scenario, structural class prediction was made directly over loops with X-ray secondary structure assignment, and if we consider the top 20 classes out of 451 possible classes, the best accuracy of prediction is 78.5%. In the second scenario, structural subclass prediction was made over loops using PSI-PRED (Jones, J Mol Biol 1999;292:195-202) secondary structure prediction to define loop boundaries, and if we take into account the top 20 subclasses out of 1492, the best accuracy is 46.7%. Accuracy of loop prediction was also evaluated by means of RMSD calculations.
KW - loop structure prediction
KW - fold recognition
KW - comparative modeling
KW - sequence profiles
UR - http://hdl.handle.net/2160/9010
U2 - 10.1002/prot.20516
DO - 10.1002/prot.20516
M3 - Article
C2 - 16021623
SN - 0887-3585
VL - 60
SP - 746
EP - 757
JO - Proteins: Structure, Function, and Bioinformatics
JF - Proteins: Structure, Function, and Bioinformatics
IS - 4
ER -