Combining structure and function-based descriptors for component retrieval in software digital libraries

  • Authors:
  • Yuhanis Yusof;Omer F. Rana

  • Affiliations:
  • Graduate Department of Computer Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia
  • School of Computer Science, Cardiff University, Cardiff CF24 3AA, Wales, UK (corresponding author e-mail: o.f.rana@cs.cardiff.ac.uk)

  • Venue:
  • Integrated Computer-Aided Engineering
  • Year:
  • 2008

Abstract

In software development, it is often desirable to re-use existing software components. A number of component repositories are currently available, generally including program source code, but finding the components that can be re-used for a given application is a challenging task. Program source code may be viewed as a form of data containing both structure and function; it is therefore important to make use of this information when representing programs in the repository. We propose combining functional and structural information to facilitate software component search and retrieval. The proposed model shows how functional and structural descriptors are identified and combined into a single representation. Functional descriptors are identified by extracting selected terms from program source code, and a weighting scheme is adopted to differentiate the importance of terms. Structural descriptors, which comprise information derived from structural relationships such as design patterns and software metrics, are extracted from a program and added as program descriptors. To retrieve components that are relevant to a given query, the use of similarity measurement based on the vector model and on data distribution is investigated. Experiments on program retrieval indicate that a combination of functional and structural descriptors performs better than functional descriptors on their own. Furthermore, programs retrieved using the proposed approach are less complex and easier to maintain.
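
The following minimal sketch illustrates the general idea of combining weighted terms with structural descriptors and ranking programs by vector-model similarity. It is not the authors' implementation: the abstract does not name the weighting scheme, so TF-IDF is assumed here, and the program names, the structural labels (such as `pattern:iterator`), the `alpha` mixing parameter, and all function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf(term_lists):
    """Weight terms per program with a simple TF-IDF (an assumed scheme)."""
    n = len(term_lists)
    df = Counter()
    for terms in term_lists.values():
        df.update(set(terms))
    vectors = {}
    for name, terms in term_lists.items():
        tf = Counter(terms)
        # idf = log(n / df) + 1 keeps terms that occur everywhere non-zero.
        vectors[name] = {
            t: (c / len(terms)) * (math.log(n / df[t]) + 1.0)
            for t, c in tf.items()
        }
    return vectors


def combine(functional, structural, alpha=0.5):
    """Merge both descriptor kinds into one sparse vector (hypothetical mix)."""
    combined = {f"term:{k}": alpha * w for k, w in functional.items()}
    combined.update(
        {f"struct:{k}": (1.0 - alpha) * w for k, w in structural.items()}
    )
    return combined


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, repository):
    """Return program names ordered from most to least similar to the query."""
    return sorted(repository, key=lambda n: cosine(query, repository[n]),
                  reverse=True)


# Hypothetical repository: terms extracted from source code per program.
terms = {
    "StackA": ["stack", "push", "pop", "list"],
    "StackB": ["stack", "push", "pop", "array"],
    "Parser": ["parse", "token", "tree"],
}
# Hypothetical structural descriptors (e.g. a detected design pattern).
structure = {
    "StackA": {"pattern:iterator": 1.0},
    "StackB": {},
    "Parser": {"pattern:visitor": 1.0},
}

weighted = tf_idf(terms)
repository = {n: combine(weighted[n], structure[n]) for n in terms}
query = combine({"stack": 1.0, "push": 1.0}, {"pattern:iterator": 1.0})
ranking = rank(query, repository)
print(ranking)
```

In this toy setting, a query that mentions both terms and a structural property can separate two programs whose extracted terms are nearly identical. This is the kind of effect the abstract attributes to the combined representation.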