Frequent pattern mining in data streams

Victor E. Lee, Ruoming Jin, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Chapter

25 Scopus citations

Abstract

As the volume of digital commerce and communication has exploded, the demand for data mining of streaming data has likewise grown. One of the fundamental data mining tasks, for both static and streaming data, is frequent pattern mining. The goal of pattern mining is to identify frequently occurring patterns and structures. Such patterns may indicate scientific phenomena, economic or social trends, or even security threats. Moreover, not only is pattern discovery important by itself, but it is also a building block for machine learning tasks such as association rule induction. Traditionally, algorithms for pattern discovery have processed the entire dataset as a batch, with no restriction on how many passes through the data would be taken. However, when the data are arriving in a continuous and unending stream, the algorithm must be limited to a single pass. Moreover, the length of the stream is indeterminate, so we cannot wait for it to end. Instead, we generate an initial result after seeing a certain quantity of data, and then we periodically revise the result. A particular challenge for frequent pattern discovery is the combinatorial explosion of candidate patterns. In this chapter, we present a structured review of online frequent pattern mining techniques. We classify the methods according to the type of pattern and data, the time window being considered, and the quality of the approximation.
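
The single-pass, periodically revised style of computation described above can be illustrated with lossy counting, which appears among this record's keywords. The following is a minimal sketch only, assuming the simplest setting of counting individual items rather than itemsets; the class name `LossyCounter`, its method names, and the parameters `epsilon` and `support` are hypothetical choices for illustration and are not taken from the chapter.

```python
import math


class LossyCounter:
    """Illustrative single-pass approximate counter for single items.

    Assumption: this is the basic item-frequency form of lossy counting,
    not any specific itemset algorithm from the chapter.
    """

    def __init__(self, epsilon):
        self.epsilon = epsilon                 # maximum allowed undercount, as a fraction of n
        self.width = math.ceil(1 / epsilon)    # number of items per bucket
        self.n = 0                             # items seen so far
        self.entries = {}                      # item -> [count, maximum possible undercount]

    def add(self, item):
        """Process one stream element; each element is seen exactly once."""
        self.n += 1
        bucket = math.ceil(self.n / self.width)
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # The item may have been seen and pruned in earlier buckets.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def frequent(self, support):
        """Return items whose stored count is at least (support - epsilon) * n."""
        threshold = (support - self.epsilon) * self.n
        return {
            key: value[0]
            for key, value in self.entries.items()
            if value[0] >= threshold
        }
```

A result can be requested at any point in the stream by calling `frequent`, which matches the idea of producing an initial answer and revising it as more data arrive. The stored counts never exceed the true counts and fall short by at most `epsilon` times the number of items seen, so the trade-off between memory and accuracy is controlled by `epsilon`.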

Original language: English (US)
Title of host publication: Frequent Pattern Mining
Editors: Charu C. Aggarwal
Publisher: Springer International Publishing
Pages: 199-224
Number of pages: 26
Volume: 9783319078212
ISBN (Electronic): 9783319078212
ISBN (Print): 3319078208, 9783319078205
State: Published - 2007
Externally published: Yes

Keywords

  • Frequent pattern mining
  • Lossy counting
  • Sliding window
  • Streaming data

ASJC Scopus subject areas

  • General Computer Science
