Enhancing Concept-Based Retrieval Based on Minimal Term Sets

  • Authors:
  • A. H. Alsaffar; J. S. Deogun; V. V. Raghavan; H. Sever

  • Affiliations:
  • Department of Computer Science & Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA; Department of Computer Science & Engineering, University of Nebraska-Lincoln, Lincoln, NE 68588, USA; The Center for Advanced Computer Studies, University of Louisiana-Lafayette, Lafayette, LA 70504, USA; Department of Computer Science & Engineering, Hacettepe University, 06532 Beytepe, Ankara, Turkey

  • Venue:
  • Journal of Intelligent Information Systems - Special issue on methodologies for intelligent information systems
  • Year:
  • 2000

Abstract

There is considerable interest in bridging the terminological gap that exists between the way users prefer to specify their information needs and the way queries are expressed in terms of keywords or text expressions that occur in documents. One of the approaches proposed for bridging this gap is based on technologies for expert systems. The central idea of such an approach was introduced in the context of a system called Rule Based Information Retrieval by Computer (RUBRIC). In RUBRIC, user query topics (or concepts) are captured in a rule base represented by an AND/OR tree. The evaluation of an AND/OR tree is essentially based on the minimum and maximum weights of query terms for conjunctions and disjunctions, respectively. The time to generate the retrieval output of an AND/OR tree for a given query topic is exponential in the number of conjunctions in the DNF expression associated with the query topic. In this paper, we propose a new approach for computing the retrieval output. The proposed approach involves preprocessing of the rule base to generate Minimal Term Sets (MTSs) that speed up the retrieval process. The computational complexity of the on-line query evaluation following the preprocessing is polynomial in m. We show that the computation and use of MTSs allow a user to choose query topics that best suit their needs and to use retrieval functions that yield a more refined and controlled retrieval output than is possible with the AND/OR tree when document terms are binary. We incorporate the p-Norm model into the process of evaluating MTSs to handle the case where the weights of both documents and query terms are non-binary.
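
The min/max evaluation of an AND/OR tree mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the RUBRIC implementation or the authors' MTS method: the class names (Term, And, Or), the evaluate function, and the example topic and weights are all hypothetical, and any weights attached to individual rules are left out for simplicity. A conjunction takes the minimum of its children's scores and a disjunction takes the maximum.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Term:
    """Leaf node: a single query term (hypothetical structure)."""
    name: str


@dataclass
class And:
    """Conjunction node: scored as the minimum of its children."""
    children: List["Node"]


@dataclass
class Or:
    """Disjunction node: scored as the maximum of its children."""
    children: List["Node"]


Node = Union[Term, And, Or]


def evaluate(node: Node, doc_weights: Dict[str, float]) -> float:
    """Score one document against an AND/OR tree using min/max."""
    if isinstance(node, Term):
        # A term missing from the document contributes weight 0.
        return doc_weights.get(node.name, 0.0)
    scores = [evaluate(child, doc_weights) for child in node.children]
    if isinstance(node, And):
        return min(scores)
    return max(scores)


# Illustrative topic: (bomb AND attack) OR terrorism. Assumed, not from the paper.
topic = Or([And([Term("bomb"), Term("attack")]), Term("terrorism")])
doc = {"bomb": 0.8, "attack": 0.5}
score = evaluate(topic, doc)  # max(min(0.8, 0.5), 0.0)
```

With binary document terms, the same function returns 1.0 exactly when the document satisfies the Boolean expression, which is the setting the abstract contrasts with the MTS-based retrieval functions.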