Local Rank Distance

by Radu Tudor Ionescu

Download the LRD 1.0 software package, released under the GNU General Public License. The package contains a Java implementation of the efficient algorithm for Local Rank Distance presented in [2, 3].

If you use this software (or a modified version of it) in any scientific work, please cite the corresponding works:

[1] Radu Tudor Ionescu. Local Rank Distance. Proceedings of SYNASC, pp. 221–228, 2013.

[2] Radu Tudor Ionescu. A Fast Algorithm for Local Rank Distance: Application to Arabic Native Language Identification. Proceedings of ICONIP, LNCS vol. 9490, pp. 390–400, 2015.

[3] Radu Tudor Ionescu and Marius Popescu. Knowledge Transfer between Computer Vision and Text Mining: Similarity-based Learning Approaches. Springer, 2016.

Local Rank Distance (LRD) is a distance measure introduced in [1]. It adapts rank distance to string data in order to better capture the similarity (or dissimilarity) between strings, such as DNA sequences or text.

LRD has already shown promising results in computational biology [1] and native language identification [2]. The LRD 1.0 software package is based on the efficient algorithm introduced in [2].

LRD is inspired by rank distance, the main differences being that it uses p-grams instead of single characters, and that it matches each p-gram in the first string with the nearest equal p-gram in the second string. An extensive presentation of LRD is provided in [3].
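The matching idea described above can be illustrated with a naive Java sketch. It is not the efficient algorithm from [2] and not the code shipped in the LRD 1.0 package. The class name `NaiveLrdSketch`, the method names, the cap `maxOffset` charged when no equal p-gram is found within the search window, and the choice to sum the costs in both directions are assumptions made for this illustration; see [1, 3] for the exact definition.

```java
// Naive illustration of p-gram matching; all names here are hypothetical.
class NaiveLrdSketch {

    // For every p-gram of `from`, add the positional offset to the nearest
    // equal p-gram of `to`. If no equal p-gram lies closer than maxOffset
    // positions away, add maxOffset instead (assumed cap).
    static long directedCost(String from, String to, int p, int maxOffset) {
        long total = 0;
        for (int i = 0; i + p <= from.length(); i++) {
            int best = maxOffset;
            // Only positions closer than maxOffset can improve on the cap.
            int lo = Math.max(0, i - maxOffset + 1);
            int hi = Math.min(to.length() - p, i + maxOffset - 1);
            for (int j = lo; j <= hi; j++) {
                int offset = Math.abs(i - j);
                if (offset < best && from.regionMatches(i, to, j, p)) {
                    best = offset;
                }
            }
            total += best;
        }
        return total;
    }

    // Assumed symmetric form: p-grams of x matched in y, plus p-grams of y matched in x.
    static long distance(String x, String y, int p, int maxOffset) {
        return directedCost(x, y, p, maxOffset) + directedCost(y, x, p, maxOffset);
    }
}
```

For example, `NaiveLrdSketch.distance("ab", "ba", 1, 2)` matches each character with its copy one position away in both directions and sums four offsets of 1, while two identical strings have distance 0. The nested loops take time proportional to the string length times `maxOffset`, which is the kind of cost an efficient algorithm aims to reduce.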

In combination with other string kernels, LRD reaches state-of-the-art results in various text classification tasks.

An alignment tool based on LRD is available here.

An extension of LRD for gesture recognition (matching temporal paths) is available here.