The C-value/NC-value Method of Automatic Recognition for Multi-Word Terms

Authors:
Katerina T. Frantzi; Sophia Ananiadou; Jun-ichi Tsujii
Venue:
ECDL '98 Proceedings of the Second European Conference on Research and Advanced Technology for Digital Libraries
Year:
1998

Abstract

Technical terms (henceforth simply "terms") are important elements of digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special-language corpora. The method, C-value/NC-value, combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, provides (1) a method for extracting term context words (words that tend to appear with terms) and (2) a way to incorporate information from term context words into the extraction of terms.
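
The C-value idea can be illustrated with a minimal sketch, assuming the formulation usually given for the measure: a candidate that is not nested scores log2(length) times its frequency, while a nested candidate has the average frequency of the longer candidates containing it subtracted first. The names `contains` and `c_value` and the toy counts are illustrative assumptions, not taken from the paper. The linguistic filter that selects candidate strings is omitted.

```python
import math

def contains(longer, shorter):
    """Return True if `shorter` is a contiguous word sequence strictly inside `longer`."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))

def c_value(freq):
    """Score each candidate term (a tuple of words) from its corpus frequency.

    Assumed formulation:
      not nested: log2(|a|) * f(a)
      nested:     log2(|a|) * (f(a) - mean frequency of longer candidates containing a)
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of the longer candidates in which `a` appears nested.
        nesting = [f_b for b, f_b in freq.items() if contains(b, a)]
        adjusted = f_a - sum(nesting) / len(nesting) if nesting else f_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores

# Toy counts (invented for illustration); f("contact lens") includes its nested uses.
freq = {
    ("soft", "contact", "lens"): 5,
    ("hard", "contact", "lens"): 1,
    ("contact", "lens"): 8,
}
scores = c_value(freq)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 3))
```

With these counts the nested candidate "contact lens" is credited only for the occurrences that the longer candidates do not account for on average, which is the sensitivity to nested terms that the abstract describes.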