Refining component description by leveraging user query logs

Authors:
Yan Li;Lu Zhang;Bing Xie;Jiasu Sun
Affiliations:
Software Institute, School of Electronic Engineering and Computer Science, Peking University, Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, PR Chi ... (all authors)
Venue:
Journal of Systems and Software
Year:
2009

Abstract

Helping reusers retrieve components efficiently and conveniently is critical to the success of component-based software development (CBSD). Much research has been devoted to improving component retrieval mechanisms. Although various retrieval methods have been proposed, retrieving software components by their description text is still prevalent in most real-world scenarios, so the quality of the description text is vital for component retrieval. Unfortunately, component descriptions often contain improper or even noisy information, which can degrade the effectiveness of the retrieval mechanism. To alleviate this problem, we propose an approach that improves component descriptions by leveraging user query logs. The key idea is to refine the description of a component by extracting proper information from the user query logs. We propose two strategies for this extraction: the first extracts information for a component only from its own related query logs, whereas the second also takes logs from similar components into consideration. We performed an experimental study on two different data sets to evaluate the effectiveness of our approach. The results show that the approach improves retrieval performance with either extraction strategy, and that it is more effective with the second strategy, which uses logs from similar components.
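
The abstract does not say how terms are extracted from the logs or how component similarity is measured, so the following is only a minimal sketch of the two strategies under stated assumptions, not the authors' method. It assumes each log entry is a (query, selected component) pair, ranks candidate terms by raw frequency, and uses Jaccard overlap of description tokens as the similarity measure. All function names, parameters (`top_k`, `min_sim`), and sample data are hypothetical.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a real system would also stem and drop stop words)."""
    return text.lower().split()

def jaccard(a, b):
    """Jaccard similarity between two token collections (assumed similarity measure)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def terms_from_logs(logs, component_id):
    """Count query terms over log entries whose selected component is component_id."""
    counts = Counter()
    for query, selected in logs:
        if selected == component_id:
            counts.update(tokenize(query))
    return counts

def rank_terms(counts, exclude, top_k):
    """Return up to top_k terms by descending weight (ties broken alphabetically), skipping excluded terms."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked if term not in exclude][:top_k]

def refine_own_logs(descriptions, logs, component_id, top_k=3):
    """Strategy 1 (sketch): extend the description with terms from the component's own logs."""
    desc_tokens = tokenize(descriptions[component_id])
    counts = terms_from_logs(logs, component_id)
    return desc_tokens + rank_terms(counts, set(desc_tokens), top_k)

def refine_with_similar(descriptions, logs, component_id, top_k=3, min_sim=0.3):
    """Strategy 2 (sketch): also use logs of similar components, weighted by similarity."""
    desc_tokens = tokenize(descriptions[component_id])
    counts = Counter()
    for other_id, other_desc in descriptions.items():
        sim = 1.0 if other_id == component_id else jaccard(desc_tokens, tokenize(other_desc))
        if sim >= min_sim:
            for term, n in terms_from_logs(logs, other_id).items():
                counts[term] += sim * n
    return desc_tokens + rank_terms(counts, set(desc_tokens), top_k)

# Hypothetical sample data: component descriptions and (query, selected component) log entries.
descriptions = {
    "c1": "xml parser library",
    "c2": "fast xml parser",
    "c3": "image resize tool",
}
logs = [
    ("sax parser", "c1"),
    ("sax xml", "c1"),
    ("dom parser", "c2"),
    ("thumbnail image", "c3"),
]

own = refine_own_logs(descriptions, logs, "c1")
shared = refine_with_similar(descriptions, logs, "c1")
print(own)     # ['xml', 'parser', 'library', 'sax']
print(shared)  # ['xml', 'parser', 'library', 'sax', 'dom']
```

In this toy data, the first strategy adds only "sax" to the description of `c1`, while the second also picks up "dom" from the logs of the similar component `c2`. That mirrors the abstract's claim that logs from similar components can contribute additional useful terms.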