CLASCN: candidate network selection for efficient top-k keyword queries over databases

  • Authors:
  • Jun Zhang;Zhao-Hui Peng;Shan Wang;Hui-Jing Nie

  • Affiliations:
  • School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing, China and Computer Science and T ...;School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing, China;School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing, China;School of Information, Renmin University of China, Beijing, China and Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing, China

  • Venue:
  • Journal of Computer Science and Technology
  • Year:
  • 2007

Abstract

Keyword Search Over Relational Databases (KSORD) enables casual or Web users to easily access databases through free-form keyword queries. Improving the performance of KSORD systems is a critical issue in this area. In this paper, a new approach, CLASCN (Classification, Learning And Selection of Candidate Network), is developed to efficiently perform top-k keyword queries in schema-graph-based online KSORD systems. In this approach, the Candidate Networks (CNs) from trained keyword queries or executed user queries are classified and stored in the databases, and top-k results from the CNs are learned for constructing CN Language Models (CNLMs). The CNLMs are used to compute the similarity scores between a new user query and the CNs generated from the query. The CNs with relatively large similarity scores, which are the most promising ones to produce top-k results, are selected and executed. Currently, CLASCN is applicable only to past queries and New All-keyword-Used (NAU) queries, which are frequently submitted queries. Extensive experiments show the efficiency and effectiveness of our CLASCN approach.
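
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the CNLM formula, so everything here is an assumption made for illustration: each CN's model is taken to be a unigram term count over the text of its learned top-k results, similarity is taken to be an add-one-smoothed log-likelihood of the query keywords, and the function names, CN identifiers, and sample texts are hypothetical.

```python
import math
from collections import Counter


def build_cnlm(result_texts):
    """Assumed model: unigram term counts over a CN's learned top-k result texts."""
    counts = Counter()
    for text in result_texts:
        counts.update(text.lower().split())
    return counts


def score(query, cnlm, vocab_size):
    """Assumed similarity: add-one-smoothed log-likelihood of the query keywords."""
    total = sum(cnlm.values())
    return sum(
        math.log((cnlm[term] + 1) / (total + vocab_size))
        for term in query.lower().split()
    )


def select_cns(query, cnlms, n):
    """Return the ids of the n CNs whose models score the query highest."""
    vocab = set()
    for model in cnlms.values():
        vocab.update(model)
    vocab_size = max(len(vocab), 1)
    ranked = sorted(
        cnlms,
        key=lambda cn_id: score(query, cnlms[cn_id], vocab_size),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical CNs and learned result texts, for illustration only.
cnlms = {
    "Author-Writes-Paper": build_cnlm(
        ["jim gray transaction processing", "jim gray data cube"]
    ),
    "Paper-Cites-Paper": build_cnlm(
        ["query optimization survey", "transaction recovery"]
    ),
}
selected = select_cns("gray transaction", cnlms, 1)
```

In this sketch only the CNs returned by `select_cns` would be executed against the database, which is where the efficiency gain described in the abstract would come from; the scoring function used in the paper itself may differ.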