Rough sets and information retrieval

  • Authors: P. Das-Gupta
  • Affiliations: Information Systems & Systems Engineering, George Mason University
  • Venue: SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1988

Abstract

The theory of rough sets was introduced by Pawlak [PAWLAK82]. It allows us to classify objects into sets of equivalent members based on their attributes. We may then examine any combination of the same objects (or even their attributes) using the resultant classification. The theory has direct applications in the design and evaluation of classification schemes and in the selection of discriminating attributes. Pawlak's papers discuss its application in the domain of medical diagnostic systems. Here we apply it to the design of information retrieval systems accessing collections of documents. The theory offers three advantages: the implicit inclusion of Boolean logic, term weighting, and the ability to rank retrieved documents. In the first section we describe the theory. This description is derived from the work of Pawlak [PAWLAK84, PAWLAK82] and includes only the most relevant aspects of the theory. In the second section we apply it to information retrieval. Specifically, we design the approximation space and search strategies, and illustrate the application of relevance feedback to improve document indexing. In section three we compare the rough set formalism to the Boolean, vector and fuzzy models of information retrieval. Finally, we present a small-scale evaluation of rough sets which indicates their potential in information retrieval.
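
As a rough illustration of the classification idea described in the abstract, the following Python sketch groups documents into equivalence classes by their index terms and then computes the standard rough-set lower and upper approximations of a target set. It is not taken from the paper: the document ids, the terms `retrieval` and `logic`, the target set `relevant`, and both function names are hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict


def equivalence_classes(objects, attributes):
    """Group object ids whose values agree on every listed attribute."""
    classes = defaultdict(set)
    for obj_id, description in objects.items():
        key = tuple(description[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(classes, target):
    """Return the (lower, upper) approximations of target.

    lower: union of classes wholly contained in target.
    upper: union of classes sharing at least one member with target.
    """
    lower, upper = set(), set()
    for cls in classes:
        if cls <= target:
            lower |= cls
        if cls & target:
            upper |= cls
    return lower, upper


# Hypothetical collection: 1 means the document is indexed by the term.
documents = {
    "d1": {"retrieval": 1, "logic": 0},
    "d2": {"retrieval": 1, "logic": 0},
    "d3": {"retrieval": 1, "logic": 1},
    "d4": {"retrieval": 0, "logic": 1},
    "d5": {"retrieval": 0, "logic": 0},
}

classes = equivalence_classes(documents, ["retrieval", "logic"])

# Hypothetical set of documents a user judged relevant.
relevant = {"d1", "d3"}
lower, upper = approximations(classes, relevant)

print(sorted(lower))  # ['d3']
print(sorted(upper))  # ['d1', 'd2', 'd3']
```

Here d1 and d2 carry identical index terms, so they cannot be told apart: d3 certainly belongs to the target set, while d1 and d2 only possibly belong. One plausible reading, under these assumptions, is that this certain/possible distinction is what supports ranking retrieved documents.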