Using semantic components to search for domain-specific documents: An evaluation from the system perspective and the user perspective

  • Authors:
  • Susan L. Price; Marianne Lykke Nielsen; Lois M. L. Delcambre; Peter Vedsted; Jeremy Steinhauer

  • Affiliations:
  • Department of Computer Science, Portland State University, P.O. Box 751, Portland, OR 97207-0751, USA; Royal School of Library & Information Sciences, Aalborg, Denmark; Department of Computer Science, Portland State University, P.O. Box 751, Portland, OR 97207-0751, USA; The Research Unit for General Practice, University of Aarhus, Aarhus, Denmark; Department of Computer Science, Portland State University, P.O. Box 751, Portland, OR 97207-0751, USA

  • Venue:
  • Information Systems
  • Year:
  • 2009

Abstract

We seek to leverage an expert user's knowledge of how information is organized in a domain, and of how it is presented in typical documents within a domain-specific collection, to meet the expert's targeted information needs effectively and efficiently. We have developed the semantic components model to describe important semantic content within documents. The model for a given collection, which is based on a general understanding of the types of information needs expected, consists of a set of document classes, each with an associated set of semantic components. Each semantic component instance consists of segments of text about a particular aspect of the document's main topic and need not correspond to structural elements of the document. The semantic components model represents document content in a manner that is complementary to full text and keyword indexing. This paper describes how the model can be used to improve an information retrieval system. We present experimental evidence from a large interactive searching study that compared two systems: a system with full text and keyword indexing whose query language we extended to let users search using semantic components, and a base system without semantic components. We evaluate the systems from a system perspective, where semantic components were shown to improve document ranking for precision-oriented searches, and from a user perspective. We also evaluate the systems from a session-based perspective, assessing not only the results of individual queries but also the results of multiple queries within a single interactive query session.
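
The structure described in the abstract (document classes, each with a set of semantic components, and component instances made of text segments that need not follow the document's layout) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the class and function names, the "disease" document class, its "diagnosis" and "treatment" components, and the sample text are all invented here, and the substring test merely stands in for whatever matching and ranking the real systems use.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class DocumentClass:
    """A document class and the semantic components defined for it (hypothetical)."""
    name: str
    components: FrozenSet[str]


@dataclass
class Document:
    """A document whose component instances are lists of character-offset segments."""
    doc_id: str
    doc_class: DocumentClass
    text: str
    # component name -> list of (start, end) offsets into `text`;
    # segments may be non-contiguous and ignore structural boundaries
    instances: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)

    def annotate(self, component: str, start: int, end: int) -> None:
        """Attach one text segment to a component allowed by the document class."""
        if component not in self.doc_class.components:
            raise ValueError(f"{component!r} is not defined for class {self.doc_class.name!r}")
        self.instances.setdefault(component, []).append((start, end))

    def component_text(self, component: str) -> str:
        """Concatenate all segments belonging to one component instance."""
        return " ".join(self.text[s:e] for s, e in self.instances.get(component, []))


def matches(doc: Document, term: str, component: Optional[str] = None) -> bool:
    """Full-text match by default; restricted to one component when given."""
    scope = doc.text if component is None else doc.component_text(component)
    return term.lower() in scope.lower()


# Illustrative data only; not taken from the study.
disease = DocumentClass("disease", frozenset({"diagnosis", "treatment"}))
text = ("Asthma is a chronic disease. Inhaled steroids are first-line "
        "treatment. Diagnosis uses spirometry.")
doc = Document("d1", disease, text)
doc.annotate("treatment", text.index("Inhaled"), text.index("Diagnosis"))

print(matches(doc, "spirometry"))               # True: found in the full text
print(matches(doc, "spirometry", "treatment"))  # False: outside the treatment segment
```

Restricting a query term to a component narrows the match to the annotated segments, which is one way to read the abstract's point that semantic components complement, rather than replace, full text and keyword indexing.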