Mining Sequential Patterns from Data Streams: a Centroid Approach.
Alice MARASCU, Florent MASSEGLIA.
La revue MODULAD, issue 36, July 2007
Abstract:
In recent years, emerging applications have introduced
new constraints for data mining methods. These constraints
are typical of a new kind of data: data streams.
In data stream processing, memory usage is restricted, new
elements are generated continuously and have to be processed as
fast as possible, no blocking operator can be used,
and the data can be examined only once.
So far, only a few methods have been proposed for
mining sequential patterns in data streams. We argue that the main reason
is the combinatorial complexity inherent in sequential pattern mining.
In this paper, we propose an algorithm based on sequence alignment for
mining approximate sequential patterns in Web usage data streams. To
meet the one-scan constraint, we propose a greedy clustering algorithm
combined with an alignment method. We show that
our proposal is able to extract relevant sequences
with very low thresholds.
Keywords: data streams, sequential patterns, web usage mining,
clustering, sequence alignment.
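
The one-scan, greedy clustering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the similarity measure or how the centroid is built, so the sketch assumes a similarity based on the longest common subsequence (LCS) and, as a simplification, uses the first sequence of each cluster as its representative instead of an aligned centroid. The names lcs_length, similarity, greedy_cluster and the threshold value are hypothetical.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed similarity: LCS length divided by the longer sequence length."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def greedy_cluster(stream, threshold: float = 0.5) -> List[list]:
    """Assign each sequence, seen exactly once, to the most similar cluster.

    Simplification: a cluster is represented by its first sequence,
    not by an aligned centroid.
    """
    clusters: List[list] = []
    for seq in stream:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = similarity(seq, cluster[0])
            if sim >= threshold and sim > best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([seq])  # no cluster is close enough: open a new one
        else:
            best.append(seq)
    return clusters


# Hypothetical navigation sequences (one list of page names per visitor).
sessions = [
    ["home", "news", "contact"],
    ["home", "news", "jobs"],
    ["docs", "api"],
    ["docs", "api", "faq"],
    ["home", "contact"],
]
clusters = greedy_cluster(sessions, threshold=0.5)
```

Each sequence is compared only with the current cluster representatives and is never revisited, which matches the single-pass constraint of data streams. Replacing the fixed representative with a sequence obtained by aligning the cluster members would bring the sketch closer to the centroid approach named in the title.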