Using text mining and link analysis for software mining

Authors:
Miha Grcar; Marko Grobelnik; Dunja Mladenic
Affiliations:
Jozef Stefan Institute, Dept. of Knowledge Technologies, Ljubljana, Slovenia; Jozef Stefan Institute, Dept. of Knowledge Technologies, Ljubljana, Slovenia; Jozef Stefan Institute, Dept. of Knowledge Technologies, Ljubljana, Slovenia
Venue:
MCD'07 Proceedings of the 3rd ECML/PKDD international conference on Mining complex data
Year:
2007

Abstract

Many data mining techniques are now used for ontology learning: text mining, Web mining, graph mining, link analysis, relational data mining, and so on. The current state-of-the-art toolkit, however, lacks "software mining" techniques, that is, techniques for extracting knowledge from source code. In this paper we approach the software mining task with a combination of text mining and link analysis. We discuss how each instance (i.e., a programming construct such as a class or a method) can be converted into a feature vector that combines information about how the instance is linked to other instances with information about its textual content. The resulting feature vectors serve as the basis for constructing the domain ontology with OntoGen, an existing system for semi-automatic, data-driven ontology construction.
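
To make the feature-vector idea concrete, here is a minimal sketch of one way such a combination could look. The abstract does not give the weighting scheme, so everything specific below is an assumption rather than the authors' method: TF-IDF for the textual part, a normalized neighbor-indicator vector for the link part, the `link_weight` mixing parameter, the toy instances, and all function names.

```python
import math
import re
from collections import Counter

# Hypothetical toy instances: name -> (textual content, names of linked instances).
INSTANCES = {
    "Graph": ("graph of nodes and edges", ["Node", "Edge"]),
    "Node": ("node in a graph", ["Edge"]),
    "Edge": ("edge connecting two nodes", ["Node"]),
    "Parser": ("parser for text tokens", []),
}


def tokenize(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def l2_normalize(vec):
    """Scale a vector to unit length; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def tfidf_vectors(instances):
    """Content part (assumed scheme): one TF-IDF vector per instance."""
    docs = {name: Counter(tokenize(text)) for name, (text, _) in instances.items()}
    vocab = sorted({term for counts in docs.values() for term in counts})
    n = len(docs)
    df = {t: sum(1 for counts in docs.values() if t in counts) for t in vocab}
    return {
        name: l2_normalize([counts[t] * math.log(n / df[t]) for t in vocab])
        for name, counts in docs.items()
    }


def link_vectors(instances):
    """Link part (assumed scheme): indicator of direct neighbors, links undirected."""
    names = sorted(instances)
    neighbors = {name: set() for name in names}
    for name, (_, links) in instances.items():
        for target in links:
            if target in neighbors:
                neighbors[name].add(target)
                neighbors[target].add(name)
    return {
        name: l2_normalize([1.0 if other in neighbors[name] else 0.0 for other in names])
        for name in names
    }


def combined_vectors(instances, link_weight=0.5):
    """Concatenate the weighted content and link parts into one feature vector."""
    text = tfidf_vectors(instances)
    link = link_vectors(instances)
    return {
        name: [(1 - link_weight) * v for v in text[name]]
        + [link_weight * v for v in link[name]]
        for name in instances
    }


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


vectors = combined_vectors(INSTANCES)
print(cosine(vectors["Node"], vectors["Edge"]))    # share a neighbor, no shared terms
print(cosine(vectors["Node"], vectors["Parser"]))  # no shared terms, no shared links
```

In this toy data, "Node" and "Edge" share no terms, so any similarity between them comes from the link part alone. Setting `link_weight=0` reduces the vectors to pure text features, and setting it to 1 reduces them to pure link features.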