Abstract
Finding the location of binding sites in DNA is a difficult problem. Although the locations of some binding sites have been experimentally identified, other parts of the genome may or may not contain binding sites. This makes the negative data used to train a classifier unreliable, because regions labeled as non-binding may still contain unidentified sites. Here we show that using randomized negative data gives a large boost in classifier performance compared with using the original labeled data.
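The abstract does not say how the randomized negative data are generated. As a minimal sketch, assuming that "randomized" means shuffling the bases of known binding sites, the Python below builds a labeled training set. The function names `shuffled_negatives` and `labeled_dataset` and the toy sequences are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
import random


def shuffled_negatives(positives, per_positive=1, seed=0):
    """Build hypothetical negative examples by shuffling each positive sequence.

    Shuffling keeps the length and base composition of a sequence but
    scrambles the order of its bases. This is one assumed reading of
    "randomized negative data", not necessarily the paper's procedure.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    positive_set = set(positives)
    negatives = []
    for seq in positives:
        for _ in range(per_positive):
            bases = list(seq)
            rng.shuffle(bases)
            candidate = "".join(bases)
            # Skip shuffles that happen to reproduce a known binding site.
            if candidate not in positive_set:
                negatives.append(candidate)
    return negatives


def labeled_dataset(positives, negatives):
    """Pair each sequence with a class label: 1 = binding site, 0 = negative."""
    return [(s, 1) for s in positives] + [(s, 0) for s in negatives]


# Toy sequences, invented for illustration only.
positives = ["TATAAAAG", "TATATAAG", "CACGTGAC"]
negatives = shuffled_negatives(positives, per_positive=2)
dataset = labeled_dataset(positives, negatives)
```

The resulting `dataset` could then be encoded as feature vectors and passed to a trainable classifier such as a support vector machine. The encoding and the classifier settings are left out here because the abstract does not specify them.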
Original language | English |
---|---|
Pages | 523-527 |
Number of pages | 5 |
Publication status | Published - 12 Dec 2010 |
Externally published | Yes |
Event | 9th IEEE International Conference on Machine Learning and Applications (ICMLA) 2010 - Washington DC, United States of America. Duration: 12 Dec 2010 → 14 Dec 2010 |
Conference
Conference | 9th IEEE International Conference on Machine Learning and Applications (ICMLA) 2010 |
---|---|
Abbreviated title | ICMLA 2010 |
Country/Territory | United States of America |
City | Washington DC |
Period | 12 Dec 2010 → 14 Dec 2010 |
Keywords
- Binding site
- Classification
- Genes
- Support vector machines