Abstract
Genome sequencing technology is generating large sequence databases at such a rate that advances in computer hardware alone are not adequate to handle them: more efficient algorithms are needed. Here, an alignment-free method of sequence comparison and visualisation based on the Chaos Games Representation (CGR) and multifractal analysis is explored as a way to search and filter a data set of over 1500 microbial genomes. Whereas BLAST takes 25 hours to search this data set with large sequence fragments (e.g. 100 kb), the method introduced here can reduce the data set by 95% (from 1550 target species to just 50) in about 15 minutes, and it predicts the exact species correctly in 67% of cases. These results demonstrate that CGR is worth further investigation as a fast method for genome sequence comparison on large data sets, and various ways to develop the method further are discussed.
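As a rough illustration of the alignment-free idea summarised above, the following minimal sketch builds a frequency CGR for a DNA sequence and compares two sequences by the Euclidean distance between their grids. It is not the authors' implementation: the corner layout, the grid resolution `k`, the function names (`cgr_points`, `fcgr`, `fcgr_distance`) and the use of a plain Euclidean distance in place of the multifractal analysis are all assumptions made for illustration.

```python
from math import sqrt

# Assumed corner layout (a common convention); the paper's own layout is not stated here.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each base moves the point halfway to its corner."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:  # skip ambiguous symbols such as N
            continue
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


def fcgr(seq, k=3):
    """Count CGR points on a 2^k x 2^k grid and normalise the counts to frequencies."""
    size = 2 ** k
    grid = [[0.0] * size for _ in range(size)]
    # The first k-1 points still depend on the starting position, so drop them.
    pts = cgr_points(seq)[k - 1:]
    for x, y in pts:
        col = min(int(x * size), size - 1)  # clamp guards against float rounding to 1.0
        row = min(int(y * size), size - 1)
        grid[row][col] += 1.0
    total = len(pts)
    if total:
        grid = [[count / total for count in row] for row in grid]
    return grid


def fcgr_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two frequency CGR grids (an illustrative choice)."""
    grid_a, grid_b = fcgr(seq_a, k), fcgr(seq_b, k)
    return sqrt(sum((p - q) ** 2
                    for row_a, row_b in zip(grid_a, grid_b)
                    for p, q in zip(row_a, row_b)))
```

In a filtering setting of the kind the abstract describes, a query fragment's grid could be compared against precomputed grids for each target genome and only the closest candidates kept; how the paper actually ranks and selects candidates is not specified in this record.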
Original language | English |
---|---|
Pages (from-to) | 1372-1381 |
Number of pages | 9 |
Journal | Procedia Computer Science |
Volume | 18 |
Digital Object Identifiers (DOIs) | |
Status | Published - 01 Jun 2013 |
Event | 2013 International Conference on Computational Science - Barcelona, Spain. Duration: 05 Jun 2013 → 07 Jun 2013 |
Fingerprint
Explore the research topics of 'Fast Comparison of Microbial Genomes Using the Chaos Games Representation for Metagenomic Applications'. Together they form a unique fingerprint.
Datasets
- Microbial genome sequences and taxonomic information based on the Genometa 2012 data set
  Swain, M., Aberystwyth University, 18 Jul 2019
  Digital Object Identifier (DOI): 10.20391/e6974906-f30f-4976-90fb-ea1679eedef0
  Dataset