An extension of the PMML standard to subspace clustering models

Authors:
Stephan Günnemann;Hardy Kremer;Thomas Seidl
Affiliations:
RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany;RWTH Aachen University, Aachen, Germany
Venue:
Proceedings of the 2011 workshop on Predictive markup language modeling
Year:
2011

Abstract

In today's applications we face the challenge of analyzing databases with many attributes per object. For such high-dimensional data, traditional clustering algorithms are known to fail to detect meaningful patterns: mining the full space is futile. As a solution, subspace clustering techniques were introduced; they analyze arbitrary subspace projections of the data to detect clustering structures. Recently, publicly available mining software has integrated subspace clustering as a novel mining paradigm, setting the stage for its wide applicability. However, a common standard to describe, exchange, and process subspace clustering results is still missing, which hinders application in practice. In this work, we propose an extension of the PMML standard to describe mining models resulting from subspace clustering methods. We thus bridge the gap between the different tools and establish a common baseline the user can rely on. Our extension considers the various aspects that subspace clustering models have to cope with, which go beyond those of traditional clustering. We will integrate this novel PMML extension in the next version of our OpenSubspace toolkit.
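
As a rough illustration of what a subspace clustering model has to record beyond a traditional clustering, the sketch below serializes clusters as pairs of (object set, relevant dimension set) into a small XML document. The element and attribute names (`SubspaceClusteringSketch`, `SubspaceCluster`, `RelevantDimensions`, `Objects`) and the function name are hypothetical and chosen only for this example; they are not taken from the proposed PMML extension, whose actual schema is not given in this abstract.

```python
import xml.etree.ElementTree as ET


def subspace_clusters_to_xml(clusters, n_dims):
    """Serialize subspace clusters to an XML string.

    clusters: list of (object_ids, relevant_dims) pairs, each a set of ints.
    n_dims:   dimensionality of the full data space.

    NOTE: all element names here are illustrative assumptions,
    not the schema of the proposed PMML extension.
    """
    root = ET.Element("SubspaceClusteringSketch", numberOfDimensions=str(n_dims))
    for idx, (object_ids, dims) in enumerate(clusters):
        cluster = ET.SubElement(root, "SubspaceCluster", id=str(idx))
        # Each cluster carries its own subset of relevant dimensions.
        ET.SubElement(cluster, "RelevantDimensions").text = " ".join(
            str(d) for d in sorted(dims)
        )
        # Objects may appear in several clusters (in different subspaces).
        ET.SubElement(cluster, "Objects").text = " ".join(
            str(o) for o in sorted(object_ids)
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Object 2 belongs to both clusters, each defined in a different subspace.
    example = [({0, 1, 2}, {0, 3}), ({2, 4}, {1})]
    print(subspace_clusters_to_xml(example, n_dims=5))
```

The example highlights two aspects that a flat partition of the full space does not capture: every cluster stores its own set of relevant dimensions, and a single object can be a member of more than one cluster.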