Clinical and Translational Science Institute

Title

K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics

Authors

Jie Lin, Fujian Normal University
Donald A. Adjeroh, West Virginia University
Bing-Hua Jiang, University of Iowa
Yue Jiang, Fujian Normal University

Document Type

Article

Publication Date

5-15-2018

Department/Program/Center

Lane Department of Computer Science and Electrical Engineering

Abstract

Motivation: Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods.

Results: We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset size.

Availability and implementation: The K2 and K2* approaches are implemented in the R language as a package, which is freely available for open access (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz).
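To make the general idea in the abstract concrete, the following is a minimal sketch of a Kendall-statistic comparison of two sequences without alignment: count the fixed-length words (k-mers) in each sequence, then compute Kendall's tau-b between the two count vectors. It is an illustration under stated assumptions, not the authors' K2 or K2* algorithm and not the API of their R package. The function names (`kmer_counts`, `kendall_tau_b`, `kmer_kendall_similarity`), the DNA alphabet, and the default word length `k=2` are hypothetical choices made for this example, and the tau computation is the naive quadratic one rather than anything optimized.

```python
from collections import Counter
from itertools import combinations, product
from math import sqrt


def kmer_counts(seq, k, alphabet="ACGT"):
    """Return counts of every length-k word over `alphabet`, in a fixed order.

    Overlapping windows are counted; windows containing characters outside
    `alphabet` are ignored. (Hypothetical helper, not part of the K2 package.)
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ("".join(p) for p in product(alphabet, repeat=k))
    return [counts[w] for w in words]


def kendall_tau_b(x, y):
    """Naive O(n^2) Kendall tau-b between two equal-length count vectors.

    Returns 0.0 when either vector is constant (tau-b is undefined there);
    that fallback is an assumption of this sketch.
    """
    concordant = discordant = ties_x = ties_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx, dy = xi - xj, yi - yj
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denom if denom else 0.0


def kmer_kendall_similarity(a, b, k=2):
    """Kendall tau-b between the k-mer count vectors of sequences a and b."""
    return kendall_tau_b(kmer_counts(a, k), kmer_counts(b, k))
```

A call such as `kmer_kendall_similarity("ACGTACGTAC", "ACGTTCGTAC", k=2)` returns a value in [-1, 1], with higher values indicating more similar k-mer rank profiles. In this sketch the word length `k` has to be chosen by the caller; the abstract states that K2* determines the appropriate length automatically, and that selection step is not modeled here.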

Digital Commons Citation

Lin, Jie; Adjeroh, Donald A.; Jiang, Bing-Hua; and Jiang, Yue, "K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics" (2018). Clinical and Translational Science Institute. 957.
https://researchrepository.wvu.edu/ctsi/957

Source Citation

Lin J, Adjeroh DA, Jiang B-H, Jiang Y. K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics. Hancock J, ed. Bioinformatics. 2017;34(10):1682-1689. doi:10.1093/bioinformatics/btx809

Included in

Medicine and Health Sciences Commons
