Towards supporting expert evaluation of clustering results using a data mining process model

Authors:
Kweku-Muata Osei-Bryson
Affiliations:
Department of Information Systems and The Information Systems Research Institute, Virginia Commonwealth University, Richmond, VA 23284, USA
Venue:
Information Sciences: an International Journal
Year:
2010

Abstract

Clustering is a popular non-directed (unsupervised) data mining technique for partitioning a dataset into a set of clusters (i.e., a segmentation). Although there are many clustering algorithms, none is superior on all datasets, so it is not clear in advance which algorithm and which parameter settings are most appropriate for a given dataset. This suggests that an appropriate approach to clustering should involve applying multiple clustering algorithms with different parameter settings, together with a non-taxing way of comparing the segmentations that these algorithms generate. In this paper we are concerned with the situation in which a domain expert has to evaluate several segmentations in order to determine the most appropriate segmentation (set of clusters) based on his or her specified objective(s). We illustrate how a data mining process model could be applied to address this problem.
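To make the idea of generating and comparing several segmentations concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not say which algorithms, parameter settings, or evaluation measures the author uses. The sketch assumes a plain k-means with a deterministic farthest-first initialization, a synthetic two-dimensional dataset, and the mean silhouette width as the single comparison criterion. All function names (`make_data`, `farthest_first`, `kmeans`, `silhouette`, `rank_segmentations`) are hypothetical and introduced only for illustration.

```python
import math
import random


def make_data():
    """Synthetic data (an assumption): three well-separated 2-D blobs."""
    rng = random.Random(0)
    blob_centers = [(0, 0), (10, 0), (5, 9)]
    return [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in blob_centers for _ in range(20)]


def farthest_first(points, k, start):
    """Deterministic initialization: begin at points[start], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[start]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, start, iters=50):
    """Plain Lloyd-style k-means; returns one cluster label per point."""
    centers = farthest_first(points, k, start)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members)
                                         for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette width of a segmentation (higher is better)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def rank_segmentations(points, ks=(2, 3, 4, 5), starts=(0, 30)):
    """Run every parameter setting and sort the segmentations by score."""
    results = []
    for k in ks:
        for start in starts:
            labels = kmeans(points, k, start)
            results.append({"k": k, "start": start,
                            "score": silhouette(points, labels)})
    return sorted(results, key=lambda r: r["score"], reverse=True)


points = make_data()
ranking = rank_segmentations(points)
for r in ranking:
    print(f"k={r['k']} start={r['start']} silhouette={r['score']:.3f}")
```

The ranked list is the kind of compact summary a domain expert could review instead of inspecting each segmentation by hand. A single internal index is a simplification: an expert with several objectives would more plausibly weigh multiple criteria, and different clustering algorithms could be added to the loop alongside the parameter settings.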