Frequency-based similarity for parameterized sequences: Formal framework, algorithms, and applications

Authors:
Gianluigi Greco; Giorgio Terracina
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

Computing sequence similarity is a problem that arises in several areas of research. Existing algorithms are often based on alignment methods and assume that the matching symbols, or at least a substitution schema among them, are known in advance. This assumption is natural for sequences defined over the same alphabet. However, for sequences defined over different alphabets, and in the absence of appropriate background knowledge, sequence similarity can be reconsidered from a different perspective in which determining the best substitution schema is itself part of the computational problem. The basic idea is that any symbol of one sequence can be correlated with many symbols of the other, provided that each correlation occurs frequently over the positions of the alignment. This setting is formalized, and relevant application domains that fit it are illustrated. Moreover, the computational complexity of the resulting alignment problems is analyzed, and practical solution approaches are proposed and validated on synthetic and real datasets.
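
The basic idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes two equal-length sequences aligned position by position with no gaps, and the names `frequent_pairs`, `similarity`, and `min_count` are hypothetical. It only shows how a symbol may be correlated with several symbols of the other alphabet, with a correlation kept only if it occurs often enough.

```python
from collections import Counter


def frequent_pairs(s, t, min_count):
    """Count symbol pairs over a position-wise alignment and keep the frequent ones.

    Assumption for this sketch: s and t have equal length and are aligned
    position by position, with no gaps.
    """
    if len(s) != len(t):
        raise ValueError("this sketch assumes equal-length, gap-free sequences")
    counts = Counter(zip(s, t))
    # A correlation (a, b) is kept only if it occurs at least min_count times.
    return {pair: n for pair, n in counts.items() if n >= min_count}


def similarity(s, t, min_count):
    """Fraction of aligned positions covered by a frequent correlation."""
    if not s:
        return 0.0
    kept = frequent_pairs(s, t, min_count)
    return sum(kept.values()) / len(s)


# Two sequences over different alphabets: 'b' pairs with both 'y' and 'z',
# but only the ('b', 'y') correlation reaches the threshold of 2.
s = "ababab"
t = "xyxzxy"
pairs = frequent_pairs(s, t, 2)
score = similarity(s, t, 2)
```

Here the substitution schema is read off the data rather than supplied in advance; the setting described in the abstract additionally searches for the best alignment, which this sketch does not attempt.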