We estimate that the global BOLD Systems database holds core DNA barcodes (rbcL + matK) for about 15% of land plant species and that comprehensive species coverage is still many decades away. In the interim, the performance of the resource is compromised by variable sequence overlap and the modest information content of each barcode. Our model predicts that the proportion of species-unique barcodes declines as the database grows and that ‘false’ species-unique barcodes remain >5% until the database is almost complete. We conclude that the current rbcL + matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow the acquisition of new barcodes. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high-quality DNA barcodes of >1400 bp. We explore how these capabilities could enhance species diagnosis in the coming decades.
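The idea of a ‘false’ species-unique barcode can be made concrete with a toy calculation. The sketch below is not the authors' model: the function name `unique_barcode_stats`, the barcode labels and the sampling order are all hypothetical, chosen only to show how a barcode can look unique in an incomplete database while actually being shared with a species that has not yet been sequenced.

```python
from collections import Counter

def unique_barcode_stats(all_barcodes, n_sampled):
    """Return (apparent_unique_fraction, false_unique_fraction).

    all_barcodes: one barcode label per species (hypothetical toy data).
    n_sampled:    number of species already in the reference database;
                  the database is assumed to hold the first n_sampled species.
    """
    database = all_barcodes[:n_sampled]
    db_counts = Counter(database)        # barcode frequencies within the database
    full_counts = Counter(all_barcodes)  # frequencies across all species

    # Barcodes that look species-unique given the current database.
    apparent = [b for b in database if db_counts[b] == 1]
    # Of those, barcodes actually shared with a species not yet sampled.
    false_unique = [b for b in apparent if full_counts[b] > 1]

    apparent_fraction = len(apparent) / n_sampled
    false_fraction = len(false_unique) / len(apparent) if apparent else 0.0
    return apparent_fraction, false_fraction

# Toy flora of 12 species; identical letters mean identical barcodes.
toy_barcodes = ["A", "B", "C", "A", "D", "B", "E", "C", "F", "A", "G", "D"]

for n in (3, 6, 12):
    apparent, false = unique_barcode_stats(toy_barcodes, n)
    print(f"{n:2d} species sampled: apparent unique = {apparent:.2f}, "
          f"false among apparent = {false:.2f}")
```

In this toy data set the fraction of apparently unique barcodes falls as more species are added, and the false-unique fraction reaches zero only when every species has been sampled. The numbers carry no biological meaning; they only illustrate the mechanism described in the abstract.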
Publication status: Published - 12 Apr 2017
- computational biology and bioinformatics
- plant science
Article title: Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants
- Faculty of Earth and Life Sciences, Department of Life Sciences - Chair in Upland Agroecosystems