Bilingual legal document retrieval and management using XML

  • Authors:
  • R. W. P. Luk; B. K. Y. T'sou; T. B. Y. Lai; O. O. Y. Kwong; F. C. Y. Chik; L. Y. L. Cheung

  • Affiliations:
  • Department of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong; Language Information Science Research Center, City University of Hong Kong, 83 Tat Chee Avenue, Yau Yat Chuen, Kowloon, Hong Kong; Language Information Science Research Center, City University of Hong Kong, 83 Tat Chee Avenue, Yau Yat Chuen, Kowloon, Hong Kong; Language Information Science Research Center, City University of Hong Kong, 83 Tat Chee Avenue, Yau Yat Chuen, Kowloon, Hong Kong; Language Information Science Research Center, City University of Hong Kong, 83 Tat Chee Avenue, Yau Yat Chuen, Kowloon, Hong Kong; Language Information Science Research Center, City University of Hong Kong, 83 Tat Chee Avenue, Yau Yat Chuen, Kowloon, Hong Kong

  • Venue:
  • Software—Practice & Experience
  • Year:
  • 2003

Abstract

In certain bilingual and multilingual societies, translated legal documents are as important as the originals because they have the same legal status. However, there is little reported work on the retrieval and management of bilingual legal documents. We describe the design and development of a bilingual document retrieval and management prototype, called ELDoS, which is used by court interpreters and judges from the Hong Kong Judiciary. Since retrieval speed is a major concern for user acceptance, and therefore for widespread deployment of the system, the architecture of the prototype is designed to balance the workload between the client and the server. Extensible Markup Language (XML) is used to mark up the bilingual legal documents for a variety of document retrieval and management tasks. XML enables the use of Extensible Stylesheet Language Transformations (XSLT) to align bilingual data on the client instead of the server. On a high-end PC, and when the server has no concurrent access, this improves alignment speed linearly with respect to the size of the document. The design of the interface was continually improved after extensive consultation with court interpreters and after the user acceptance tests. In our evaluation, the facilities for highlighting translated terms have a macro-averaged precision of over 90% and a macro-averaged recall of over 80%, which our users considered acceptable. We believe that the experience gained in the design and development of this prototype is applicable to other language pairs as well as to other domains.
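
The evaluation figures above are macro-averaged: precision and recall are computed separately for each evaluation unit and then averaged with equal weight. The abstract does not say what the unit is or how the gold standard was built, so the following is only a minimal sketch under assumptions. It treats each document as one unit, and the function name `macro_precision_recall`, the document identifiers, and the term sets are all hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Dict, Set, Tuple


def macro_precision_recall(
    highlighted: Dict[str, Set[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Average per-document precision and recall with equal weight per document.

    highlighted: terms the system highlighted, keyed by document id (assumed shape).
    gold: terms a human judged correct, keyed by document id (assumed shape).
    """
    if not gold:
        return 0.0, 0.0

    precisions = []
    recalls = []
    for doc_id, gold_terms in gold.items():
        found = highlighted.get(doc_id, set())
        correct = len(found & gold_terms)
        # Precision: share of highlighted terms that are correct.
        precisions.append(correct / len(found) if found else 0.0)
        # Recall: share of correct terms that were highlighted.
        recalls.append(correct / len(gold_terms) if gold_terms else 0.0)

    n = len(gold)
    return sum(precisions) / n, sum(recalls) / n


# Made-up example data, for illustration only.
example_highlighted = {
    "doc1": {"plaintiff", "defendant"},
    "doc2": {"injunction", "appeal"},
}
example_gold = {
    "doc1": {"plaintiff", "defendant", "tort"},
    "doc2": {"injunction"},
}

precision, recall = macro_precision_recall(example_highlighted, example_gold)
print(f"macro precision = {precision:.3f}, macro recall = {recall:.3f}")
```

Macro-averaging gives a short document the same weight as a long one, so a few documents with many terms cannot dominate the result. A micro-average would pool all terms before dividing and could yield different figures.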