Federating Diverse Collections of Scientific Literature

  • Authors:
  • Bruce Schatz;William H. Mischo;Timothy W. Cole;Joseph B. Hardin;Ann P. Bishop;Hsinchun Chen

  • Affiliations:
  • -;-;-;-;-;-

  • Venue:
  • Computer
  • Year:
  • 1996

Quantified Score

Hi-index 4.10

Visualization

Abstract

The Digital Library Initiative (DLI) project at the University of Illinois at Urbana-Champaign is developing the information infrastructure to effectively search technical documents on the Internet. The authors are constructing a large testbed of scientific literature, evaluating its effectiveness under significant use, and researching enhanced search technology. They are building repositories (organized collections) of indexed multiple-source collections and federating (merging and mapping) them by searching the material via multiple views of a single virtual collection. Developing widely usable Web technology is also a key goal. Improving Web search beyond full-text retrieval will require using document structure in the short term and document semantics in the long term. Their testbed efforts concentrate on journal articles from the scientific literature, with structure specified by the Standard Generalized Markup Language (SGML). Research efforts extract semantics from documents using the scalable technology of concept spaces based on context frequency. They then merge these efforts with traditional library indexing to provide a single Internet interface to indexes of multiple repositories.