Full text document retrieval: Hebrew legal texts (report on the first phase of the responsa retrieval project)

  • Authors:
  • Y. Choueka;M. Cohen;J. Dueck;A. S. Fraenkel;M. Slae

  • Affiliations:
  • Bar-Ilan University;The Inter-Kibbutz Computer Center, Tel-Aviv;The Hebrew University of Jerusalem;The Weizmann Institute of Science, and Bar-Ilan University;The Weizmann Institute of Science, and Bar-Ilan University

  • Venue:
  • SIGIR '71 Proceedings of the 1971 international ACM SIGIR conference on Information storage and retrieval
  • Year:
  • 1971

Abstract

A full-text retrieval system was designed for the responsa literature, a large corpus of Hebrew legal cases. The unique problems of the data base (a mixture of Hebrew, Aramaic and vernaculars; lack of vowels and punctuation; extreme inflection; homographs; and the existence of thousands of grammatical variants of any given keyword) dictated the development of new methods. Among them are "grammatical synthesis", which synthesizes all grammatical variants of a given keyword; "Compact KWIC", which gives the user a glimpse of the nature of a search before it is performed; an effective citation index embedded in full-text searches; and, in general, extensive use of both positive and negative feedback within a single search run. A number of searches performed on a relatively small data base gave a recall of 100% in each case, with an average precision of 34%. A KWIC of strategic portions of the retrieved documents usually enables quick disposal of non-relevant material.
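
To make the reported figures concrete, the following minimal sketch computes precision and recall for a single search using the standard definitions (precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that are retrieved). The function name `precision_recall` and the document counts are hypothetical, chosen only so that the result matches the 100% recall and 34% precision quoted above; they are not data from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search (illustrative numbers, not from the paper):
# 50 documents retrieved, 17 of them relevant, no relevant document missed.
retrieved = range(1, 51)
relevant = range(1, 18)
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under these assumed counts, every relevant document is found, but roughly two out of three retrieved documents still have to be discarded by the user, which is the situation the KWIC display of retrieved documents is meant to ease.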