Data guided approach to generate multi-dimensional schema for targeted knowledge discovery

Authors:
Muhammad Usman;Russel Pears;A. C. M. Fong
Affiliations:
Auckland University of Technology, Auckland, New Zealand;Auckland University of Technology, Auckland, New Zealand;Auckland University of Technology, Auckland, New Zealand
Venue:
AusDM '12 Proceedings of the Tenth Australasian Data Mining Conference - Volume 134
Year:
2012

Abstract

Data mining and data warehousing are two key technologies that have made significant contributions to knowledge discovery in a variety of domains. The integrated use of traditional data mining techniques, such as clustering and pattern recognition, with the data warehousing technique of Online Analytical Processing (OLAP) has motivated diverse research into knowledge discovery from complex real-world datasets. A number of such integrated methodologies have recently been proposed, but most of them lack automated and generic methods for schema generation and knowledge extraction. Data analysts mostly have to rely on domain-specific knowledge and cope with technological constraints in order to discover knowledge from high-dimensional datasets. In this paper we present a generic methodology that incorporates semi-automated knowledge extraction methods to provide data-driven assistance for knowledge discovery. In particular, we provide a method for constructing a binary tree of hierarchical clusters and annotating each node in the tree with significant numeric variables. Additionally, we propose automated methods to rank nominal variables and to generate candidate multidimensional schemas with highly significant dimensions. We have performed three case studies on three real-world datasets taken from the UCI Machine Learning Repository in order to validate the generality and applicability of the proposed methodology.
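
The idea of a binary cluster tree whose nodes carry numeric-variable annotations can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the linkage criterion or the significance measure, so this sketch assumes centroid-linkage agglomerative clustering and scores each variable by the gap between the two child centroids divided by that variable's overall standard deviation. The names `Node`, `build_tree`, `most_separating`, and the toy data are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from statistics import mean, pstdev
from typing import List, Optional


@dataclass
class Node:
    members: List[int]                  # row indices covered by this cluster
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    top_variable: Optional[str] = None  # annotation; stays None for leaves


def centroid(rows, members):
    """Per-variable mean of the rows belonging to a cluster."""
    return [mean([rows[i][j] for i in members]) for j in range(len(rows[0]))]


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def most_separating(rows, names, left, right):
    """Assumed stand-in for 'significant': the variable whose child
    centroids differ most, measured in units of its overall std. dev."""
    spreads = [pstdev([row[j] for row in rows]) for j in range(len(names))]
    lc = centroid(rows, left.members)
    rc = centroid(rows, right.members)
    scores = [abs(l - r) / s if s > 0 else 0.0
              for l, r, s in zip(lc, rc, spreads)]
    return names[scores.index(max(scores))]


def build_tree(rows, names):
    """Repeatedly merge the two closest clusters; each merge becomes a
    binary node annotated with the variable that best separates its children."""
    clusters = [Node([i]) for i in range(len(rows))]
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: euclidean(centroid(rows, pair[0].members),
                                       centroid(rows, pair[1].members)),
        )
        parent = Node(a.members + b.members, a, b)
        parent.top_variable = most_separating(rows, names, a, b)
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(parent)
    return clusters[0]


# Toy data (hypothetical): two groups that differ mainly in "height".
names = ["height", "weight"]
rows = [[1.0, 10.0], [1.1, 10.2], [5.0, 10.1], [5.2, 9.9]]
root = build_tree(rows, names)
print(root.top_variable)  # height
```

In this toy run the root splits rows 0 and 1 from rows 2 and 3 and is annotated with `height`, the variable along which the two groups differ most. A real application would need scaled inputs and a statistically grounded significance test in place of the simple ratio assumed here.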