With the growth of event-driven applications, event stream processing has received increasing attention in the database community. However, little work has addressed data mining and similarity analysis across event streams. Efficient similarity search is a prerequisite for data mining tasks such as detecting frequent or abnormal event patterns. In this paper, we take a first step toward similarity search in the context of massive event streams. We propose a simple but effective model that improves the efficiency of similarity search. To avoid redundant pairwise comparisons, we adopt the definition of sharing extent to filter out dissimilar event streams and speed up the similarity computation. Extensive simulated experiments demonstrate that our model and algorithm achieve higher efficiency while guaranteeing the expected accuracy.
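
The filter-then-verify idea described above can be illustrated with a small sketch. The abstract does not define sharing extent or the similarity measure, so this example rests on assumptions: sharing extent is taken to be the number of distinct event types two streams have in common, similarity is taken to be the Jaccard coefficient over event-type sets, and the names `similar_pairs`, `min_shared`, and `threshold` are hypothetical. It is not the paper's algorithm, only one way such a filter might avoid comparing every pair.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not from the paper)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_pairs(streams, min_shared, threshold):
    """Return {(id1, id2): similarity} for stream pairs that pass the filter.

    streams: dict mapping a sortable stream id to an iterable of event types.
    min_shared: assumed sharing-extent cutoff (shared distinct event types).
    threshold: minimum Jaccard similarity to report a pair.
    """
    type_sets = {sid: set(events) for sid, events in streams.items()}

    # Inverted index: event type -> ids of the streams that contain it.
    # Ids are visited in sorted order so each pair has one canonical form.
    index = defaultdict(list)
    for sid in sorted(type_sets):
        for etype in type_sets[sid]:
            index[etype].append(sid)

    # Count shared event types only for pairs that co-occur in a posting
    # list; pairs with nothing in common are never touched.
    shared = defaultdict(int)
    for sids in index.values():
        for pair in combinations(sids, 2):
            shared[pair] += 1

    result = {}
    for (a, b), count in shared.items():
        if count < min_shared:
            continue  # filtered out before any similarity computation
        sim = jaccard(type_sets[a], type_sets[b])
        if sim >= threshold:
            result[(a, b)] = sim
    return result


if __name__ == "__main__":
    demo = {
        "s1": ["login", "click", "buy"],
        "s2": ["login", "click", "logout"],
        "s3": ["error", "retry"],
    }
    print(similar_pairs(demo, min_shared=2, threshold=0.5))
```

In this sketch the pair involving `s3` is never scored, because it shares no event type with the other streams and so never appears in a common posting list.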