Open source software has become an indispensable basis for both individual and industrial software engineering. Open source communities use various labeling mechanisms, such as categories, keywords, and tags, to annotate projects and help users discover software. However, because many projects carry few or no labels, and existing labels come from different ontology spaces, retrieving topic-relevant software remains difficult. This paper highlights the semantic information carried by project descriptions and labels and proposes labeled software topic detection (LSTD), a hybrid approach that combines topic models and ranking mechanisms to detect and enrich the topics of software by mining large numbers of textual software profiles. The detected topics can be used for software categorization and tag recommendation. LSTD uses labeled LDA to capture the semantic correlations between labels and descriptions and then constructs a label-based topic-word matrix. Based on this matrix and the generality of labels, LSTD applies a simple yet efficient algorithm to detect the latent topics of software, expressed as relevant and popular labels. Comprehensive evaluations on large-scale datasets from representative open source communities validate the effectiveness of LSTD.
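The pipeline described in the abstract (a label-word matrix followed by a ranking over relevant and popular labels) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the algorithm's details, so the code below substitutes plain co-occurrence counting for Labeled LDA inference, and the scoring rule, the `alpha` exponent, the function names, and the toy project data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_label_word_matrix(projects, smoothing=0.01):
    """Approximate P(word | label) from labeled project descriptions.

    Assumption: this is a count-based stand-in for Labeled LDA. Each word
    in a description is credited equally to every label of that project,
    instead of being assigned to one label by Gibbs sampling.
    """
    counts = defaultdict(Counter)
    vocab = set()
    for description, labels in projects:
        words = description.lower().split()
        vocab.update(words)
        for label in labels:
            for word in words:
                counts[label][word] += 1.0 / len(labels)
    matrix = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values()) + smoothing * len(vocab)
        # Smoothed, normalized row: probabilities over the whole vocabulary.
        matrix[label] = {w: (word_counts[w] + smoothing) / total for w in vocab}
    return matrix


def label_popularity(projects):
    """Fraction of projects that carry each label (a proxy for generality)."""
    freq = Counter(label for _, labels in projects for label in labels)
    return {label: count / len(projects) for label, count in freq.items()}


def recommend_labels(description, matrix, popularity, top_k=3, alpha=0.5):
    """Rank labels by mean word relevance times popularity ** alpha.

    The combination rule is hypothetical; it only illustrates weighing
    relevance against popularity.
    """
    words = description.lower().split()
    scores = {}
    for label, word_probs in matrix.items():
        known = [w for w in words if w in word_probs]
        if not known:
            continue  # no vocabulary overlap, nothing to score
        relevance = sum(word_probs[w] for w in known) / len(known)
        scores[label] = relevance * popularity[label] ** alpha
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy profiles (hypothetical): (description, labels) pairs.
projects = [
    ("fast http web server", ["web", "server"]),
    ("web framework for python", ["web", "python"]),
    ("python library for numerical computing", ["python", "science"]),
    ("numerical linear algebra toolkit", ["science"]),
]
matrix = build_label_word_matrix(projects)
popularity = label_popularity(projects)
print(recommend_labels("lightweight http web server", matrix, popularity))
```

In this sketch, `alpha` controls how strongly popular labels are favored: `alpha=0` ranks by relevance alone, while larger values push general labels upward. A faithful reproduction would replace `build_label_word_matrix` with the topic-word matrix learned by Labeled LDA.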