Reviewing and Evaluating Automatic Term Recognition Techniques

Authors:
Ioannis Korkontzelos; Ioannis P. Klapaftis; Suresh Manandhar
Affiliations:
Department of Computer Science, The University of York, Heslington, York, UK YO10 5NG;Department of Computer Science, The University of York, Heslington, York, UK YO10 5NG;Department of Computer Science, The University of York, Heslington, York, UK YO10 5NG
Venue:
GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
Year:
2008



Abstract

Automatic Term Recognition (ATR) is defined as the task of identifying domain-specific terms from technical corpora. Termhood-based approaches measure the degree to which a candidate term refers to a domain-specific concept. Unithood-based approaches measure the attachment strength of a candidate term's constituents. These methods have been evaluated using different, often incompatible evaluation schemes and datasets. This paper provides an overview and a thorough evaluation of state-of-the-art ATR methods under a common evaluation framework, i.e. the same corpora and evaluation method. Our contributions are two-fold: (1) We compare a number of different ATR methods, showing that termhood-based methods in general achieve superior performance. (2) We show that the number of independent occurrences of a candidate term is the most effective source for estimating term nestedness, improving ATR performance.
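
To make two notions from the abstract concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that pointwise mutual information serves as a simple unithood cue for two-word candidates (the abstract does not say which measures were evaluated), and that an occurrence counts as "independent" when it is not contained in an occurrence of a longer candidate. The function names `pmi`, `find_occurrences` and `independent_counts`, and the toy sentence, are hypothetical.

```python
from collections import Counter
from math import log2


def pmi(pair_count, first_count, second_count, total_tokens):
    """Pointwise mutual information of a two-word candidate.

    Assumption: used here only as one simple example of a unithood cue.
    """
    p_pair = pair_count / total_tokens
    p_first = first_count / total_tokens
    p_second = second_count / total_tokens
    return log2(p_pair / (p_first * p_second))


def find_occurrences(tokens, candidates):
    """Return (start, end, candidate) for every candidate match in tokens."""
    max_len = max((len(c) for c in candidates), default=0)
    spans = []
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tokens):
                break
            gram = tuple(tokens[start:end])
            if gram in candidates:
                spans.append((start, end, gram))
    return spans


def independent_counts(tokens, candidates):
    """Count occurrences that are not nested inside a longer candidate match."""
    spans = find_occurrences(tokens, candidates)
    counts = Counter()
    for start, end, gram in spans:
        nested = any(
            other_start <= start
            and end <= other_end
            and (other_end - other_start) > (end - start)
            for other_start, other_end, _ in spans
        )
        if not nested:
            counts[gram] += 1
    return counts


# Toy example (hypothetical data): "t cell" occurs twice in total,
# but only once outside the longer candidate "t cell receptor".
tokens = "the t cell receptor binds antigen while each t cell divides".split()
candidates = {("t", "cell"), ("t", "cell", "receptor")}
result = independent_counts(tokens, candidates)
```

In this toy sentence, `result` assigns one independent occurrence to each candidate. A nestedness estimate could then compare a candidate's independent count with its total count, although how the paper combines these quantities is not stated in the abstract.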