Applications of term identification technology: domain description and content characterisation

Authors:
Branimir Boguraev; Christopher Kennedy
Affiliations:
IBM T.J. Watson Research Center, IBM Corporation, NY, USA (e-mail: bkb@watson.ibm.com); Department of Linguistics, Northwestern University, IL, USA (e-mail: kennedy@ling.nwu.edu)
Venue:
Natural Language Engineering
Year:
1999

Abstract

The identification and extraction of technical terms is one of the better understood and most robust Natural Language Processing (NLP) technologies within the current state of the art of language engineering. In generic information management contexts, terms have been used primarily in procedures that seek to identify a set of phrases useful for tasks such as text indexing, computational lexicology, and machine-assisted translation; such tasks rely on the assumption that terminology is representative of a given domain. This paper discusses an extension of basic terminology identification technology for application to two higher-level semantic tasks: domain description, the specification of the technical domain of a document, and content characterisation, the construction of a compact, coherent and useful representation of the topical content of a text. With these extensions, terminology identification becomes the foundation of an operational environment for document processing and content abstraction.