An integrated term-based corpus query system

  • Authors:
  • Irena Spasic; Goran Nenadic; Kostas Manios; Sophia Ananiadou

  • Affiliations:
  • University of Salford; UMIST; University of Salford; University of Salford

  • Venue:
  • EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Abstract

In this paper we describe the X-TRACT workbench, which enables efficient term-based querying of a domain-specific literature corpus. Its main aim is to help domain specialists locate and extract new knowledge from scientific literature corpora. Before querying, the corpus undergoes automatic terminological analysis by the ATRACT system, which performs terminology recognition based on the C/NC-value method, enhanced with term variation handling. The results of terminological processing are annotated in XML, and the resulting XML documents are stored in a native XML database. All corpus retrieval operations are performed against this database using an XML query language. We illustrate how the X-TRACT workbench can be used for knowledge discovery, literature mining and conceptual information extraction.
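
To make the idea of term-based querying over XML-annotated text more concrete, the following is a minimal sketch in Python. It is not the X-TRACT implementation: the abstract does not specify the annotation schema, the database or the query language, so the element names (`corpus`, `abstract`, `sentence`, `term`), the `nf` attribute holding a normalised term form, the sample sentences and the helper `sentences_with_term` are all assumptions made for illustration. An in-memory XML string and the standard-library `xml.etree.ElementTree` module stand in for a native XML database and its query language.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation schema: the element names and the "nf"
# (normalised form) attribute are illustrative, not taken from the paper.
ANNOTATED = """
<corpus>
  <abstract id="a1">
    <sentence><term nf="retinoic acid receptor">Retinoic acid receptors</term> bind <term nf="DNA">DNA</term>.</sentence>
    <sentence>The <term nf="retinoic acid receptor">RAR</term> forms a heterodimer.</sentence>
  </abstract>
  <abstract id="a2">
    <sentence><term nf="DNA">DNA</term> damage was observed.</sentence>
  </abstract>
</corpus>
"""


def sentences_with_term(root, normalised_form):
    """Return (abstract id, sentence text) pairs for sentences that
    contain a term annotated with the given normalised form."""
    hits = []
    for abstract in root.iter("abstract"):
        for sentence in abstract.iter("sentence"):
            # Match on the normalised form, so that different surface
            # variants of the same term are retrieved together.
            if any(t.get("nf") == normalised_form for t in sentence.iter("term")):
                # itertext() concatenates the text inside and around the tags.
                hits.append((abstract.get("id"), "".join(sentence.itertext())))
    return hits


root = ET.fromstring(ANNOTATED)
for doc_id, text in sentences_with_term(root, "retinoic acid receptor"):
    print(doc_id, text)
```

Because the query matches the normalised form rather than the surface string, both annotated variants in the sample are retrieved by a single query, which is the kind of benefit that term variation handling is meant to provide at retrieval time.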