TY - GEN
T1 - Faster algorithm for the set variant of the string barcoding problem
AU - Ga̧sieniec, Leszek
AU - Li, Cindy Y.
AU - Zhang, Meng
PY - 2008/7/1
Y1 - 2008/7/1
N2 - A string barcoding problem is defined as to find a minimum set of substrings that distinguish between all strings in a given set of strings . In a biological sense the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can be also obtained from more general test set problem, see, e.g., [1] by fixing appropriate parameters. We present almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1] which obtains the best possible approximation ratio (1∈+∈ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, use of several specialised graph and string data structures as well as tighter time complexity analysis based on an amortised argument.
AB - A string barcoding problem is defined as to find a minimum set of substrings that distinguish between all strings in a given set of strings . In a biological sense the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can be also obtained from more general test set problem, see, e.g., [1] by fixing appropriate parameters. We present almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1] which obtains the best possible approximation ratio (1∈+∈ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, use of several specialised graph and string data structures as well as tighter time complexity analysis based on an amortised argument.
UR - http://www.scopus.com/inward/record.url?scp=45849139741&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=45849139741&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-69068-9_10
DO - 10.1007/978-3-540-69068-9_10
M3 - Conference contribution
AN - SCOPUS:45849139741
SN - 3540690662
SN - 9783540690665
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 82
EP - 94
BT - Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings
T2 - 19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008
Y2 - 18 June 2008 through 20 June 2008
ER -