Le Monde des Utilisateurs de L'Analyse de Données

Issue 36


Fundamentals of Analyzing and Mining Data Streams. Graham CORMODE
La revue MODULAD, issue 36, July 2007

Many scenarios, such as network analysis, utility monitoring, and financial applications, generate massive streams of data. These streams consist of millions or billions of simple updates every hour, and must be processed to extract information that is spread across these many tiny pieces. This survey provides an introduction to the problems of data stream monitoring, and to some of the techniques that have been developed over recent years to help mine the data without drowning in these massive flows of information. In particular, this tutorial introduces the fundamental techniques used to create compact summaries of data streams: sampling, sketching, and other synopsis techniques. It describes how to extract features such as clusters and association rules. Lastly, it presents methods to detect when and how the process generating the stream is evolving, indicating that some important change has occurred.

Keywords: Data streams, sampling, sketches, association rules, clustering, change detection.

Download paper: Fundamentals of Analyzing and Mining Data Streams

Download slides: Fundamentals of Analyzing and Mining Data Streams