Rank-Based Sequence Aligners

by Liviu P. Dinu, Radu Tudor Ionescu, Alexandru I. Tomescu

Download the software package released under the GNU Public License.

If you use this software (or a modified version of it) in any scientific work, please cite the corresponding paper:

Liviu P. Dinu, Radu T. Ionescu, Alexandru I. Tomescu. Rank-Based Sequence Aligners with Applications in Phylogenetic Analysis. PLoS ONE 9(8): e104006, 2014. [BibTeX] [Download PDF]

Recent tools for aligning short DNA reads have been designed to optimize the trade-off between correctness and speed. The rank-based aligners aim to better satisfy these two needs.

Two methods for assigning a set of short DNA reads to a reference genome are presented here. They align the reads under rank distance (RD) and Local Rank Distance (LRD), respectively. Several indexing strategies are proposed to speed up the two aligners. For the RD aligner, two strategies are investigated: a prefix trie, and a hash function based on character frequencies. The LRD aligner is made faster by storing the k-mer positions of each read in a hash table. A further improvement, which yields an approximate LRD aligner, considers only the positions in the reference that are likely to be a good positional match for the read.
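As a rough illustration of two ideas mentioned above, the following Python sketch computes a rank distance between two strings and builds a hash table of k-mer positions for a read. It is not taken from the released package. It assumes the usual formulation of rank distance, in which each character is annotated with its occurrence number, shared annotated characters contribute the absolute difference of their 1-based positions, and unshared ones contribute their own position. The function names `annotate`, `rank_distance`, and `kmer_index` are hypothetical.

```python
from collections import defaultdict


def annotate(s):
    """Map each (character, occurrence number) pair to its 1-based position."""
    seen = defaultdict(int)
    ranks = {}
    for pos, ch in enumerate(s, start=1):
        seen[ch] += 1
        ranks[(ch, seen[ch])] = pos
    return ranks


def rank_distance(a, b):
    """Rank distance between two strings (assumed formulation, see lead-in)."""
    ra, rb = annotate(a), annotate(b)
    total = 0
    for key in ra.keys() | rb.keys():
        if key in ra and key in rb:
            # Annotated character occurs in both strings: compare positions.
            total += abs(ra[key] - rb[key])
        else:
            # Occurs in only one string: it contributes its own position.
            total += ra.get(key, 0) + rb.get(key, 0)
    return total


def kmer_index(read, k):
    """Hash table from each k-mer of the read to its 0-based start positions."""
    index = defaultdict(list)
    for i in range(len(read) - k + 1):
        index[read[i:i + k]].append(i)
    return dict(index)
```

For example, `rank_distance("ACGT", "CAGT")` evaluates to 2 under this formulation, because only A and C swap positions, and `kmer_index("ACGAC", 2)` maps `"AC"` to the positions 0 and 3. A real aligner would evaluate such a distance at many candidate positions of the reference, which is where the indexing strategies come in.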

The proposed aligners were evaluated and compared to other state-of-the-art alignment tools (such as BWA, BOWTIE, and BLAST) in several experiments. The empirical results showed that the aligners proposed here can, in some cases, be a good alternative to standard alignment tools, whether the need is for a sequence aligner that runs fast (such as the RD aligner) or for one that is highly accurate from a biological point of view (such as the LRD aligner).

Download the software package containing the RD aligner, the LRD aligner, and a SAM evaluation tool. The software is released under the GNU Public License. Please cite the paper if you use this software in your scientific work.