Abstract
Poly-transformation extends the idea of ensemble learning to the transformation step of Knowledge Discovery in Databases (KDD). In poly-transformation, multiple transformations of the data are made before learning (data mining) is applied. The theoretical basis for poly-transformation is the same as that for other combining methods: using different predictors to remove uncorrelated errors. It is not possible to demonstrate the utility of poly-transformation using standard datasets, because no pre-transformed data exist for such datasets. We therefore demonstrate its utility by applying it to a single well-known hard problem for which we have expertise: predicting protein secondary structure from primary structure. We applied four different transformations of the data, each of which was justifiable by biological background knowledge. We then applied four different learning methods (linear discrimination, back-propagation, C5.0, and learning vector quantization) both to each of the four transformations individually and to the combination of predictions from the different transformations, which forms the poly-transformation prediction. Each of the learning methods produced significantly higher accuracy with poly-transformation than with only a single transformation. Poly-transformation is the basis of the secondary structure prediction method Prof, which is one of the most accurate existing methods for this problem.
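The general scheme described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three toy transformations, the nearest-centroid learner, and the majority-vote combination rule are hypothetical stand-ins, not the transformations, learning methods, or combining procedure used in the paper or in Prof. The sketch only shows the shape of the idea: transform the same data in several ways, train one predictor per transformation, then combine their predictions.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
Transform = Callable[[Sequence[float]], Vector]


class NearestCentroid:
    """Toy learner (an assumption, not one of the paper's four methods)."""

    def fit(self, X: List[Vector], y: List[str]) -> "NearestCentroid":
        sums: Dict[str, List[float]] = {}
        counts = Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
        # One centroid per class: the mean of that class's training vectors.
        self.centroids = {
            label: tuple(s / counts[label] for s in acc)
            for label, acc in sums.items()
        }
        return self

    def predict(self, x: Vector) -> str:
        def sq_dist(label: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))

        return min(self.centroids, key=sq_dist)


def poly_fit(X, y, transforms: List[Transform]):
    """Train one model per transformation of the same raw data."""
    return [(t, NearestCentroid().fit([t(x) for x in X], y)) for t in transforms]


def poly_predict(models, x) -> str:
    """Combine per-transformation predictions by majority vote.

    Ties go to the label that was voted for first (Counter ordering).
    """
    votes = Counter(model.predict(t(x)) for t, model in models)
    return votes.most_common(1)[0][0]


# Hypothetical transformations of a raw numeric example.
transforms: List[Transform] = [
    lambda x: tuple(x),                       # identity
    lambda x: tuple(v * v for v in x),        # element-wise square
    lambda x: (sum(x), max(x) - min(x)),      # sum and range
]

# Toy data with two well-separated classes.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]

models = poly_fit(X, y, transforms)
print(poly_predict(models, (0.5, 0.5)))
print(poly_predict(models, (5.5, 5.5)))
```

On data this easy every transformation agrees, so the vote adds nothing; the benefit claimed in the abstract depends on the individual predictors making uncorrelated errors.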
Original language | English |
---|---|
Pages | 99-107 |
Number of pages | 9 |
Status | Published - 2004 |