Managing large-scale software projects involves activities such as viewpoint extraction, feature detection, and requirements management, all of which require a human analyst to perform the arduous task of organizing requirements into meaningful topics and themes. Automating these tasks with data mining techniques such as clustering could increase both the efficiency of the work and the reliability of its results. Unfortunately, the characteristics of this domain present difficult challenges for standard clustering algorithms: the data sets are high-dimensional, sparse, and noisy because needs are expressed in short, ambiguous statements, and stakeholders must be engaged interactively at various stages of the process. In this paper, we propose a semi-supervised clustering framework that combines consensus-based and constrained clustering techniques to handle these challenges effectively. Specifically, we provide a probabilistic analysis for generating informative constraints based on a co-association matrix, and we use consensus clustering to combine multiple constrained partitions into high-quality, robust clusters. Our approach is validated through a series of experiments on six well-studied TREC data sets and on two sets of user requirements.
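
To make the co-association matrix mentioned above concrete, the following is a minimal sketch and not the paper's method. It assumes each partition is given as a list of cluster labels, one per item, and it defines entry (i, j) as the fraction of partitions that place items i and j in the same cluster. The helper `uncertain_pairs` and its `low`/`high` thresholds are hypothetical: they only illustrate the general idea that pairs on which the partitions disagree are natural candidates for a constraint query, and they stand in for the probabilistic analysis, which the abstract does not spell out.

```python
from itertools import combinations


def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions in which items i and j share a cluster label."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def uncertain_pairs(matrix, low=0.3, high=0.7):
    """Hypothetical helper: list item pairs whose co-association lies
    between `low` and `high`, i.e. pairs the partitions disagree on.
    The thresholds are illustrative assumptions, not from the paper."""
    n = len(matrix)
    return [
        (i, j)
        for i, j in combinations(range(n), 2)
        if low <= matrix[i][j] <= high
    ]


# Three toy partitions of four items (label values are arbitrary names).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
matrix = co_association(partitions)
pairs = uncertain_pairs(matrix)
print(pairs)  # [(0, 2), (1, 2), (2, 3)]
```

In this toy input, items 0 and 1 are grouped together by every partition, so their entry is 1.0 and they are not flagged, while the pairs on which the partitions split are returned as candidates.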