Episode matching

Gautam Das, Rudolf Fleischer, Leszek Gąsieniec, Dimitris Gunopulos, Juha Kärkkäinen

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

54 Scopus citations

Abstract

Given two words, text T of length n and episode P of length m, the episode matching problem is to find all minimal length substrings of text T that contain episode P as a subsequence. The respective optimization problem is to find the smallest number w such that text T has a subword of length w which contains episode P as a subsequence. In this paper, we introduce several efficient off-line and on-line algorithms for the entire problem, where by on-line algorithms we mean algorithms which scan consecutive text symbols from left to right only once. We present two alphabet-independent algorithms which work in time O(nm). The off-line algorithm operates in O(1) additional space, while the on-line algorithm pays for its on-line property with O(m) additional space. Two other on-line algorithms have subquadratic time complexity. One of them works in time O(nm / log m) and O(m) additional space. The other one gives a time/space trade-off, i.e., it works in time O(n + s + nm log log s / log(s/m)) when additional space is limited to O(s). Finally, we present two approximation algorithms for the optimization problem. The off-line algorithm is alphabet independent; it has superlinear time complexity O(n/ε + n log log(n/m)) and uses only constant space. The on-line algorithm works in time O(n/ε + n) and uses space O(m). Both approximation algorithms achieve a 1 + ε approximation ratio, for any ε > 0.
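The problem statement above can be illustrated with a short sketch. The Python function below is a textbook-style dynamic program, not necessarily any of the authors' algorithms. It reads the text once from left to right, takes O(nm) time and O(m) extra space (excluding the reported output), and so matches the bounds the abstract quotes for the simple on-line algorithm. The names `episode_matches`, `start`, and `windows` are illustrative assumptions and do not come from the paper.

```python
from typing import List, Optional, Tuple


def episode_matches(text: str, episode: str) -> Tuple[Optional[int], List[Tuple[int, int]]]:
    """Return (w, windows): w is the minimal window length, and windows lists
    the inclusive (start, end) index pairs of every substring of that length
    containing `episode` as a subsequence. Returns (None, []) if none exists.

    Illustrative sketch only; assumes a non-empty episode is the interesting case.
    """
    m = len(episode)
    if m == 0:
        return 0, []

    # start[j] = latest text index s such that episode[0..j] is a subsequence
    # of text[s..i], where i is the position processed most recently.
    start: List[Optional[int]] = [None] * m
    best: Optional[int] = None
    windows: List[Tuple[int, int]] = []

    for i, c in enumerate(text):
        # Go from high j to low j so start[j - 1] still holds the value
        # from before position i was read.
        for j in range(m - 1, -1, -1):
            if episode[j] != c:
                continue
            start[j] = i if j == 0 else start[j - 1]

        # A minimal window must end on the last episode symbol.
        s = start[m - 1]
        if episode[-1] == c and s is not None:
            length = i - s + 1
            if best is None or length < best:
                best, windows = length, [(s, i)]
            elif length == best:
                windows.append((s, i))

    return best, windows
```

For example, under these assumptions `episode_matches("axbxxab", "ab")` reports the single window covering indices 5 to 6, and the first component of the result is the value w of the optimization problem.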

Original language: English (US)
Title of host publication: Combinatorial Pattern Matching - 8th Annual Symposium, CPM 1997, Proceedings
Editors: Alberto Apostolico, Jotun Hein
Publisher: Springer Verlag
Pages: 12-27
Number of pages: 16
ISBN (Print): 9783540632207
State: Published - 1997
Externally published: Yes
Event: 8th Annual Symposium on Combinatorial Pattern Matching, CPM 1997 - Aarhus, Denmark
Duration: Jun 30 1997 to Jul 2 1997

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1264
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Annual Symposium on Combinatorial Pattern Matching, CPM 1997
Country/Territory: Denmark
City: Aarhus
Period: 6/30/97 to 7/2/97

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
