TY - GEN
T1 - Episode matching
AU - Das, Gautam
AU - Fleischer, Rudolf
AU - Gąsieniec, Leszek
AU - Gunopulos, Dimitris
AU - Kärkkäinen, Juha
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1997.
PY - 1997
Y1 - 1997
N2 - Given two words, text T of length n and episode P of length m, the episode matching problem is to find all minimal length substrings of text T that contain episode P as a subsequence. The respective optimization problem is to find the smallest number w, s.t. text T has a subword of length w which contains episode P. In this paper, we introduce a few efficient off-line as well as on-line algorithms for the entire problem, where by on-line algorithms we mean algorithms which search from left to right consecutive text symbols only once. We present two alphabet independent algorithms which work in time O(nm). The off-line algorithm operates in O(1) additional space while the on-line algorithm pays for its property with O(m) additional space. Two other on-line algorithms have subquadratic time complexity. One of them works in time O(nm/log m) and O(m) additional space. The other one gives a time/space trade-off, i.e., it works in time O(n + s +nm log log s/ log(s/m)) when additional space is limited to O(s). Finally, we present two approximation algorithms for the optimization problem. The off-line algorithm is alphabet independent, it has superlinear time complexity O(n/ε + n log log(n/m)) and it uses only constant space. The on-line algorithm works in time O(n/ε + n) and uses space O(m). Both approximation algorithms achieve 1 + ε approximation ratio, for any e > 0.
AB - Given two words, text T of length n and episode P of length m, the episode matching problem is to find all minimal length substrings of text T that contain episode P as a subsequence. The respective optimization problem is to find the smallest number w, s.t. text T has a subword of length w which contains episode P. In this paper, we introduce a few efficient off-line as well as on-line algorithms for the entire problem, where by on-line algorithms we mean algorithms which search from left to right consecutive text symbols only once. We present two alphabet independent algorithms which work in time O(nm). The off-line algorithm operates in O(1) additional space while the on-line algorithm pays for its property with O(m) additional space. Two other on-line algorithms have subquadratic time complexity. One of them works in time O(nm/log m) and O(m) additional space. The other one gives a time/space trade-off, i.e., it works in time O(n + s +nm log log s/ log(s/m)) when additional space is limited to O(s). Finally, we present two approximation algorithms for the optimization problem. The off-line algorithm is alphabet independent, it has superlinear time complexity O(n/ε + n log log(n/m)) and it uses only constant space. The on-line algorithm works in time O(n/ε + n) and uses space O(m). Both approximation algorithms achieve 1 + ε approximation ratio, for any e > 0.
UR - http://www.scopus.com/inward/record.url?scp=84879531103&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84879531103&partnerID=8YFLogxK
U2 - 10.1007/3-540-63220-4_46
DO - 10.1007/3-540-63220-4_46
M3 - Conference contribution
AN - SCOPUS:84879531103
SN - 9783540632207
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 12
EP - 27
BT - Combinatorial Pattern Matching - 8th Annual Symposium, CPM 1997, Proceedings
A2 - Apostolico, Alberto
A2 - Apostolico, Alberto
A2 - Hein, Jotun
PB - Springer Verlag
T2 - 8th Annual Symposium on Combinatorial Pattern Matching, CPM 1997
Y2 - 30 June 1997 through 2 July 1997
ER -