In data mining, computing the similarity of objects is an essential task, for example to identify regularities or to build homogeneous clusters of objects. For sequential data, which arise in many application fields (e.g. series of customer purchases, Internet navigation), comparing sequences is particularly important. Existing similarity measures such as Edit distance and LCS are suited to simple sequences, but they are not appropriate for complex sequences composed of sets of items, as is the case for sequential patterns. In this paper, we propose S2MP, a new similarity measure that takes the characteristics of sequential patterns into account. Unlike existing measures, S2MP is adjustable: the importance given to each characteristic of sequential patterns can be set according to context. We evaluated the accuracy and quality of S2MP against Edit distance by using both measures to cluster sequential patterns. The results show that the clusters obtained with S2MP are more homogeneous, and also more precise and more complete than the clusters obtained with Edit distance. The experiments also show that S2MP is efficient in terms of computation time and memory usage.
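The limitation of Edit distance on sequences of itemsets can be illustrated with a minimal sketch. This is not S2MP, whose definition is not given in this abstract. It is only a plain Levenshtein distance applied to sequences whose elements are itemsets, with each itemset treated as one atomic symbol. The names `edit_distance` and `seq` and the example patterns are assumptions made for illustration.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def edit_distance(s: Sequence[Itemset], t: Sequence[Itemset]) -> int:
    """Levenshtein distance where each itemset is one atomic symbol."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            # Substitution costs 1 unless the two itemsets are exactly equal;
            # partial overlap between itemsets is ignored entirely.
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def seq(*itemsets: str) -> List[Itemset]:
    """Hypothetical helper: build a sequence of itemsets from strings of items."""
    return [frozenset(x) for x in itemsets]


pattern = seq("ab", "c")   # <{a, b}, {c}>
near = seq("abd", "c")     # first itemset shares a and b with the pattern
far = seq("xy", "c")       # first itemset shares nothing with the pattern

print(edit_distance(pattern, near))  # 1
print(edit_distance(pattern, far))   # 1
```

Both comparisons return the same distance, although `near` shares two items with the first itemset of `pattern` and `far` shares none. A measure designed for sequential patterns would be expected to distinguish these two cases.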