An OODBMS-IRS Integration Based on a Statistical Corpus Extraction Method for Document Management

  • Authors:
  • Chung-Hong Lee; Lee-Feng Chien

  • Venue:
  • DEXA '99 Proceedings of the 10th International Conference on Database and Expert Systems Applications
  • Year:
  • 1999

Abstract

Maintenance cost is a critical factor in the success of integrating database systems with information retrieval systems (IRSs). For a robust integration of search engines, a signature file filter can effectively eliminate this cost and offer a more natural fit between the database and the text retrieval system. Building the merged database and signature-based text retrieval system on an object-oriented database management system (OODBMS) extends its usability and offers complementary advantages to both databases and IRSs. In this paper, we present a new approach for integrating OODBMSs and IRSs that preserves flexibility and avoids the overhead of a mapping process by encapsulating the documents and the signature-based IR methods in storable objects that are kept in the database. In addition, we develop a novel signature file approach based on a statistical corpus extraction technique, which can effectively reduce the false drop probability for text retrieval from the underlying document database.
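The general idea of a signature file filter with documents encapsulated as objects can be illustrated with a minimal sketch. This is not the authors' method: it uses plain superimposed coding, and it does not attempt the statistical corpus extraction technique, which the abstract does not describe. The names `SignedDocument`, `term_signature`, and `search`, and the parameters `SIG_BITS` and `BITS_PER_TERM`, are hypothetical and chosen only for illustration.

```python
import hashlib

SIG_BITS = 64       # assumed signature width in bits
BITS_PER_TERM = 3   # assumed number of bits set per term


def term_signature(term: str) -> int:
    """Hash a term to a bit pattern with up to BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


class SignedDocument:
    """Hypothetical storable object holding a document and its signature."""

    def __init__(self, doc_id: int, text: str):
        self.doc_id = doc_id
        self.text = text
        self.terms = set(text.lower().split())
        # Superimposed coding: OR together the signatures of all terms.
        self.signature = 0
        for term in self.terms:
            self.signature |= term_signature(term)

    def may_contain(self, term: str) -> bool:
        """Signature filter: never a false negative, sometimes a false positive."""
        query = term_signature(term)
        return (self.signature & query) == query

    def contains(self, term: str) -> bool:
        """Exact check against the document's own terms."""
        return term.lower() in self.terms


def search(docs, term):
    """Return matching document ids and the number of false drops."""
    candidates = [d for d in docs if d.may_contain(term)]
    hits = [d.doc_id for d in candidates if d.contains(term)]
    return hits, len(candidates) - len(hits)


docs = [
    SignedDocument(1, "object oriented database management systems"),
    SignedDocument(2, "signature file filter for text retrieval"),
]
hits, false_drops = search(docs, "signature")
print(hits, false_drops)
```

A false drop here is a document whose signature passes the filter although it does not contain the query term. The second, exact check removes such documents, so lowering the false drop probability mainly saves the cost of that verification step.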