A comparative study of dimensionality reduction techniques to enhance trace clustering performances

Authors:
M. Song;H. Yang;S. H. Siadat;M. Pechenizkiy
Affiliations:
School of Technology Management, Ulsan National Institute of Science and Technology, UNIST-GIL 50, 689-798 Ulsan, South Korea;School of Technology Management, Ulsan National Institute of Science and Technology, UNIST-GIL 50, 689-798 Ulsan, South Korea;School of Technology Management, Ulsan National Institute of Science and Technology, UNIST-GIL 50, 689-798 Ulsan, South Korea;Department of Computer Science, Eindhoven University of Technology, Den Dolech 2, 5612 AZ Eindhoven, The Netherlands
Venue:
Expert Systems with Applications: An International Journal
Year:
2013

Abstract

Process mining techniques analyze event logs from information systems in order to derive useful patterns. In the big data era, however, real-life event logs are so large, unstructured, and complex that traditional process mining techniques have difficulty analyzing them. To reduce this complexity, trace clustering can be used to group similar traces together and then mine a simpler, more structured process model locally for each cluster. The high dimensionality of the feature space in which the traces are represented, however, poses several problems for trace clustering. In this paper, we study the effect of applying dimensionality reduction as a preprocessing step on the performance of trace clustering. Our experimental study uses three popular feature transformation techniques: singular value decomposition (SVD), random projection (RP), and principal component analysis (PCA), together with state-of-the-art trace clustering in process mining. Experimental results on a dataset constructed from a real event log of patient treatment processes in a Dutch hospital show that dimensionality reduction can improve trace clustering performance with respect to both computation time and the average fitness of the mined local process models.
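
As an illustration of the pipeline the abstract describes, the following minimal Python sketch reduces a trace feature matrix with SVD, PCA, or random projection before clustering. It is not the authors' implementation. The bag-of-activities profile, the toy traces, every function name, and the use of plain k-means in place of the paper's trace clustering algorithms are assumptions made here for illustration only.

```python
import numpy as np

def activity_profile(traces):
    """Bag-of-activities matrix (assumed profile): rows are traces, columns are activities."""
    activities = sorted({a for t in traces for a in t})
    index = {a: j for j, a in enumerate(activities)}
    X = np.zeros((len(traces), len(activities)))
    for i, trace in enumerate(traces):
        for a in trace:
            X[i, index[a]] += 1
    return X, activities

def reduce_svd(X, k):
    """Truncated SVD: coordinates of each trace on the top-k right singular vectors."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def reduce_pca(X, k):
    """PCA computed as an SVD of the column-centered matrix."""
    return reduce_svd(X - X.mean(axis=0), k)

def reduce_rp(X, k, seed=0):
    """Gaussian random projection onto k dimensions, scaled by 1/sqrt(k)."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

def kmeans(Z, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means, a stand-in for the trace clustering step."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Hypothetical toy log: each trace is a sequence of activity names.
traces = [list("abcd"), list("abcd"), list("abce"),
          list("xyz"), list("xyzz"), list("xyw")]
X, activities = activity_profile(traces)

Z_svd = reduce_svd(X, 2)
Z_pca = reduce_pca(X, 2)
Z_rp = reduce_rp(X, 2)
labels = kmeans(Z_svd, n_clusters=2)
```

In a full study each cluster's traces would then be passed to a process discovery algorithm and the fitness of the resulting local models compared across the three reductions; that step is outside the scope of this sketch.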