A consensus based approach to constrained clustering of software requirements

Authors:
Chuan Duan; Jane Cleland-Huang; Bamshad Mobasher
Affiliations:
DePaul University, Chicago, IL, USA; DePaul University, Chicago, IL, USA; DePaul University, Chicago, IL, USA
Venue:
Proceedings of the 17th ACM conference on Information and knowledge management
Year:
2008

Abstract

Managing large-scale software projects involves activities such as viewpoint extraction, feature detection, and requirements management, all of which require a human analyst to perform the arduous task of organizing requirements into meaningful topics and themes. Automating these tasks with data mining techniques such as clustering could increase both the efficiency of the work and the reliability of its results. Unfortunately, this domain presents difficult challenges for standard clustering algorithms: its data sets are high-dimensional, sparse, and noisy because needs are expressed in short and ambiguous statements, and stakeholders must be engaged interactively at various stages of the process. In this paper, we propose a semi-supervised clustering framework that combines consensus-based and constrained clustering techniques to handle these challenges effectively. Specifically, we provide a probabilistic analysis for informative constraint generation based on a co-association matrix, and we use consensus clustering to combine multiple constrained partitions into high-quality, robust clusters. Our approach is validated through a series of experiments on six well-studied TREC data sets and on two sets of user requirements.
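
To make the co-association idea concrete, the following is a minimal sketch in Python and is not the paper's implementation. It assumes each partition is a list of cluster labels, one per requirement, and that the co-association value of a pair is the fraction of partitions that place both items in the same cluster. The function names, the thresholds, the connected-components consensus step, and the "ambiguous pair" heuristic for choosing candidate constraints are all illustrative assumptions; the paper's probabilistic analysis and its consensus function may differ.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that put items i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Assumed consensus step: link every pair whose co-association
    exceeds the threshold and label the connected components."""
    n = len(matrix)
    labels = [None] * n
    next_label = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and matrix[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


def ambiguous_pairs(matrix, low=0.3, high=0.7):
    """Assumed heuristic: pairs on which the partitions disagree are
    candidates for a must-link / cannot-link question to an analyst."""
    n = len(matrix)
    return [(i, j) for i, j in combinations(range(n), 2)
            if low <= matrix[i][j] <= high]


if __name__ == "__main__":
    # Three hypothetical partitions of five requirements.
    # Label values are arbitrary; only co-membership matters.
    partitions = [
        [0, 0, 1, 1, 1],
        [0, 0, 0, 1, 1],
        [1, 1, 0, 0, 0],
    ]
    matrix = co_association(partitions)
    print(consensus_clusters(matrix))
    print(ambiguous_pairs(matrix))
```

In this toy input, requirement 2 is the only item whose cluster membership changes between partitions, so every pair flagged as ambiguous involves it. Under the stated heuristic, those are the pairs for which a stakeholder's answer would carry the most information.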