Collaborative research - digital government: a language modeling approach to metadata for cross-database linkage and search

  • Authors:
  • W. Bruce Croft; Jamie Callan

  • Affiliations:
  • University of Massachusetts Amherst; Carnegie Mellon University

  • Venue:
  • dg.o '04: Proceedings of the 2004 annual national conference on Digital government research
  • Year:
  • 2004

Abstract

This research demonstrates that language models are a sound and effective foundation on which to build large-scale, distributed information systems for government applications. It helps provide an alternative to human-generated metadata for locating information resources. Manual indexing is expensive, and studies show that human indexers are inconsistent and inaccurate, which leads to poor retrieval effectiveness. Content descriptions generated automatically from the markup and structure of documents are less expensive and, when coupled with good search techniques, can locate relevant information more consistently. The evaluation testbeds for our research have been government databases such as those found in FedStats and GPO.