TY - JOUR
T1 - Improved approximate common interval
AU - Amir, Amihood
AU - Gasieniec, Leszek
AU - Shalom, Riva
N1 - Funding Information:
* Corresponding author at: Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel. Tel.: +972 3 531 8770. E-mail addresses: amir@cs.biu.ac.il (A. Amir), leszek@csc.liv.ac.uk (L. Gasieniec), gonenr1@cs.biu.ac.il (R. Shalom). 1 Partly supported by NSF grant CCR-01-04494 and ISF grant 35/05. 2 Tel.: +44 151 794 3686. 3 Tel.: +972 3 531 8408.
PY - 2007/8/16
Y1 - 2007/8/16
N2 - The approximate common interval (ACI) problem is discussed, in which the character sets of intervals of multiple genome strings are compared with one another. Genomes are considered as strings, with possible repeats of symbols representing paralogous genes, and gene clusters are detected by modeling gene intervals by their sets of characters. An algorithm is presented that locates all intervals of two strings that share the same character set. The ACI problem for a given number of strings can be solved in time and space that depend on the length of each string. A procedure for extracting all maximal character sets of the input strings is given, and the ACI problem is studied for both a single input string and multiple input strings. The approach provides a simple and versatile algorithm for the approximate common interval problem.
AB - The approximate common interval (ACI) problem is discussed, in which the character sets of intervals of multiple genome strings are compared with one another. Genomes are considered as strings, with possible repeats of symbols representing paralogous genes, and gene clusters are detected by modeling gene intervals by their sets of characters. An algorithm is presented that locates all intervals of two strings that share the same character set. The ACI problem for a given number of strings can be solved in time and space that depend on the length of each string. A procedure for extracting all maximal character sets of the input strings is given, and the ACI problem is studied for both a single input string and multiple input strings. The approach provides a simple and versatile algorithm for the approximate common interval problem.
KW - Computational biology
KW - Design of algorithms
KW - Gene evolution
KW - Hamming distance
KW - Pattern matching
UR - http://www.scopus.com/inward/record.url?scp=34249011129&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34249011129&partnerID=8YFLogxK
U2 - 10.1016/j.ipl.2007.03.006
DO - 10.1016/j.ipl.2007.03.006
M3 - Article
AN - SCOPUS:34249011129
SN - 0020-0190
VL - 103
SP - 142
EP - 149
JO - Information Processing Letters
JF - Information Processing Letters
IS - 4
ER -