Abstract
As the volume of digital commerce and communication has exploded, the demand for data mining of streaming data has likewise grown. One of the fundamental data mining tasks, for both static and streaming data, is frequent pattern mining. The goal of pattern mining is to identify frequently occurring patterns and structures. Such patterns may indicate scientific phenomena, economic or social trends, or even security threats. Moreover, not only is pattern discovery important by itself, but it is also a building block for machine learning tasks such as association rule induction. Traditionally, algorithms for pattern discovery have processed the entire dataset as a batch, with no restriction on how many passes through the data would be taken. However, when the data are arriving in a continuous and unending stream, our algorithm must be limited to a single pass. Moreover, the length of the stream is indeterminate, so we cannot wait for it to end. We generate an initial result after seeing a certain quantity of data, and then we periodically revise the result. A particular challenge for frequent pattern discovery is the combinatorial explosion of candidate patterns. In this chapter, we present a structured review of online frequent pattern mining techniques. We classify the methods according to the type of pattern and data, the time window being considered, and the quality of the approximation.
Original language | English (US) |
---|---|
Title of host publication | Frequent Pattern Mining |
Editors | Charu C. Aggarwal |
Publisher | Springer International Publishing |
Pages | 199-224 |
Number of pages | 26 |
Volume | 9783319078212 |
ISBN (Electronic) | 9783319078212 |
ISBN (Print) | 3319078208, 9783319078205 |
DOIs | |
State | Published - 2007 |
Externally published | Yes |
Keywords
- Frequent pattern mining
- Lossy counting
- Sliding window
- Streaming data
ASJC Scopus subject areas
- Computer Science(all)