Academic Journals Database
Disseminating quality controlled scientific knowledge

Rapid DNA barcoding analysis of large datasets using the composition vector method

Author(s): Chu Ka | Xu Minli | Li Chi

Journal: BMC Bioinformatics
ISSN 1471-2105

Volume: 10
Issue: Suppl 14
Start page: S8
Date: 2009

ABSTRACT
Background: Sequence alignment is the rate-limiting step in constructing profile trees for DNA barcoding purposes. We recently demonstrated the feasibility of using unaligned rRNA sequences as barcodes based on a composition vector (CV) approach without sequence alignment (Bioinformatics 22:1690). Here, we further explored the grouping effectiveness of the CV method in large DNA barcode datasets (COI, 18S and 16S rRNA) from a variety of organisms, including birds, fishes, nematodes and crustaceans.

Results: Our results indicate that the grouping of taxa at the genus/species levels based on the CV/NJ approach is invariably consistent with the trees generated by traditional approaches, although in some cases the clustering among higher groups might differ. Furthermore, the CV method is always much faster than the K2P method routinely used in constructing profile trees for DNA barcoding. For instance, the alignment of 754 COI sequences (average length 649 bp) from fishes took more than ten hours to complete, while the whole tree construction process using the CV/NJ method required no more than five minutes on the same computer.

Conclusion: The CV method performs well in grouping effectiveness of DNA barcode sequences, as compared to K2P analysis of aligned sequences. It was also able to reduce the time required for analysis by over 15-fold, making it a far superior method for analyzing large datasets. We conclude that the CV method is a fast and reliable method for analyzing large datasets for DNA barcoding purposes.
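
To illustrate the alignment-free idea behind a composition vector, the following is a minimal Python sketch, not the authors' implementation. It represents each sequence by its normalized k-mer frequencies and compares sequences with a cosine-based distance. The function names, the choice of k = 3, and the distance formula are assumptions made for illustration; the sketch also omits any background-correction step that the published CV method may apply, and it stops at the distance matrix rather than building the NJ tree.

```python
from collections import Counter
from itertools import product
from math import sqrt


def composition_vector(seq, k=3):
    """Return normalized k-mer frequencies of a DNA sequence in a fixed order.

    Hypothetical helper: k-mers containing characters other than A, C, G, T
    are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence has no unambiguous k-mers")
    return [counts[m] / total for m in kmers]


def cv_distance(u, v):
    """Assumed distance: (1 - cosine similarity) / 2."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return (1.0 - dot / norm) / 2.0


def distance_matrix(seqs, k=3):
    """Pairwise distances between unaligned sequences; no alignment needed."""
    vectors = [composition_vector(s, k) for s in seqs]
    return [[cv_distance(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    # Toy sequences, invented for this example only.
    toy = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
    for row in distance_matrix(toy, k=2):
        print(["%.3f" % d for d in row])
```

Because each vector is computed from its own sequence alone, the cost of building the vectors grows linearly with the number of sequences, and no multiple alignment is required. A distance matrix of this kind could then be passed to any neighbor-joining implementation.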