Faster algorithm for the set variant of the string barcoding problem

Leszek Ga̧sieniec; Cindy Y. Li; Meng Zhang

doi:10.1007/978-3-540-69068-9_10

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The string barcoding problem asks for a minimum set of substrings that distinguish between all strings in a given set of strings. In a biological sense, the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can also be obtained from the more general test set problem (see, e.g., [1]) by fixing appropriate parameters. We present an almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1], which obtains the best possible approximation ratio (1 + ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, the use of several specialised graph and string data structures, and a tighter time complexity analysis based on an amortised argument.

Original language	English (US)
Title of host publication	Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings
Pages	82-94
Number of pages	13
DOIs	https://doi.org/10.1007/978-3-540-69068-9_10
State	Published - 2008
Externally published	Yes
Event	19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008 - Pisa, Italy. Duration: Jun 18 2008 → Jun 20 2008

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5029 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008
Country/Territory	Italy
City	Pisa
Period	6/18/08 → 6/20/08

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Cite this

Ga̧sieniec, L., Li, C. Y., & Zhang, M. (2008). Faster algorithm for the set variant of the string barcoding problem. In Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings (pp. 82-94). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5029 LNCS). https://doi.org/10.1007/978-3-540-69068-9_10

Faster algorithm for the set variant of the string barcoding problem. / Ga̧sieniec, Leszek; Li, Cindy Y.; Zhang, Meng.
Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings. 2008. p. 82-94 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5029 LNCS).

Ga̧sieniec, L, Li, CY & Zhang, M 2008, Faster algorithm for the set variant of the string barcoding problem. in Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5029 LNCS, pp. 82-94, 19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008, Pisa, Italy, 6/18/08. https://doi.org/10.1007/978-3-540-69068-9_10

@inproceedings{b9d0d415bb034763893e2bde6a24547f,

title = "Faster algorithm for the set variant of the string barcoding problem",

abstract = "The string barcoding problem asks for a minimum set of substrings that distinguish between all strings in a given set of strings. In a biological sense, the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can also be obtained from the more general test set problem (see, e.g., [1]) by fixing appropriate parameters. We present an almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1], which obtains the best possible approximation ratio (1 + ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, the use of several specialised graph and string data structures, and a tighter time complexity analysis based on an amortised argument.",

author = "Leszek G{\c a}sieniec and Li, {Cindy Y.} and Meng Zhang",

year = "2008",

doi = "10.1007/978-3-540-69068-9_10",

language = "English (US)",

isbn = "3540690662",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

pages = "82--94",

booktitle = "Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings",

note = "19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008 ; Conference date: 18-06-2008 Through 20-06-2008",

}

TY - GEN

T1 - Faster algorithm for the set variant of the string barcoding problem

AU - Ga̧sieniec, Leszek

AU - Li, Cindy Y.

AU - Zhang, Meng

PY - 2008

Y1 - 2008

N2 - The string barcoding problem asks for a minimum set of substrings that distinguish between all strings in a given set of strings. In a biological sense, the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can also be obtained from the more general test set problem (see, e.g., [1]) by fixing appropriate parameters. We present an almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1], which obtains the best possible approximation ratio (1 + ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, the use of several specialised graph and string data structures, and a tighter time complexity analysis based on an amortised argument.

AB - The string barcoding problem asks for a minimum set of substrings that distinguish between all strings in a given set of strings. In a biological sense, the given strings represent a set of genomic sequences and the substrings serve as probes in a hybridisation experiment. In this paper, we study a variant of the string barcoding problem in which the substrings have to be chosen from a particular set of substrings of cardinality n. This variant can also be obtained from the more general test set problem (see, e.g., [1]) by fixing appropriate parameters. We present an almost optimal -time approximation algorithm for the considered problem. Our approximation procedure is a modification of the algorithm due to Berman et al. [1], which obtains the best possible approximation ratio (1 + ln n), providing . The improved time complexity is a direct consequence of more careful management of processed sets, the use of several specialised graph and string data structures, and a tighter time complexity analysis based on an amortised argument.

UR - http://www.scopus.com/inward/record.url?scp=45849139741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=45849139741&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-69068-9_10

DO - 10.1007/978-3-540-69068-9_10

M3 - Conference contribution

AN - SCOPUS:45849139741

SN - 3540690662

SN - 9783540690665

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 82

EP - 94

BT - Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings

T2 - 19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008

Y2 - 18 June 2008 through 20 June 2008

ER -
