Lattice-based word identification in CLARE

Authors: David M. Carter
Affiliations: SRI International, Cambridge Computer Science Research Centre, Cambridge, U.K.
Venue: ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Year: 1992

Abstract

I argue that because of spelling and typing errors and other properties of typed text, the identification of words and word boundaries in general requires syntactic and semantic knowledge. A lattice representation is therefore appropriate for lexical analysis. I show how the use of such a representation in the CLARE system allows different kinds of hypothesis about word identity to be integrated in a uniform framework. I then describe a quantitative evaluation of CLARE's performance on a set of sentences into which typographic errors have been introduced. The results show that syntax and semantics can be applied as powerful sources of constraint on the possible corrections for misspelled words.
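
To make the lattice idea concrete, here is a minimal sketch in Python. It is not CLARE's implementation: the toy lexicon, the function names (`within_one_edit`, `build_lattice`, `paths`, `readings`), the one-edit respelling rule, and the choice to propose respellings only for tokens missing from the lexicon are all assumptions made for illustration. Vertices are token boundaries, and each edge carries one hypothesis about word identity: an exact lexicon match, a respelling, a split at a missing space, or a join across a spurious space.

```python
from collections import defaultdict

# Hypothetical toy lexicon, for illustration only.
LEXICON = {"show", "me", "the", "them", "flights", "to", "boston", "at", "a"}

def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion,
    substitution, or transposition of adjacent characters."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        diff = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diff) == 1:
            return True  # single substitution
        return (len(diff) == 2 and diff[1] == diff[0] + 1
                and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]])
    if len(a) > len(b):
        a, b = b, a  # make a the shorter string
    # a must equal b with exactly one character deleted
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))

def build_lattice(text, lexicon=LEXICON):
    """Return {vertex: [(next_vertex, word, kind), ...]}.
    Integer vertex i is the boundary before token i."""
    tokens = text.split()
    lattice = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            lattice[i].append((i + 1, tok, "exact"))
        else:
            # Respelling hypotheses (assumed: only for unknown tokens).
            for word in sorted(lexicon):
                if within_one_edit(tok, word):
                    lattice[i].append((i + 1, word, "respell"))
        # Missing space: the token may be two lexicon words run together.
        for k in range(1, len(tok)):
            if tok[:k] in lexicon and tok[k:] in lexicon:
                mid = ("mid", i, k)  # extra vertex inside token i
                lattice[i].append((mid, tok[:k], "split"))
                lattice[mid].append((i + 1, tok[k:], "split"))
        # Spurious space: two adjacent tokens may form one lexicon word.
        if i + 1 < len(tokens) and tok + tokens[i + 1] in lexicon:
            lattice[i].append((i + 2, tok + tokens[i + 1], "join"))
    return lattice

def paths(lattice, start, goal):
    """Yield every word sequence along a path from start to goal."""
    if start == goal:
        yield []
        return
    for nxt, word, _kind in lattice.get(start, []):
        for rest in paths(lattice, nxt, goal):
            yield [word] + rest

def readings(text, lexicon=LEXICON):
    """All candidate word sequences for text, as sorted strings."""
    lattice = build_lattice(text, lexicon)
    goal = len(text.split())
    return sorted(" ".join(p) for p in paths(lattice, 0, goal))

if __name__ == "__main__":
    for reading in readings("show me th flights toboston"):
        print(reading)
```

With this toy lexicon, the input "show me th flights toboston" yields two paths through the lattice, "show me the flights to boston" and "show me to flights to boston". Lexical analysis alone cannot choose between them. In the approach the abstract describes, that choice is left to syntactic and semantic knowledge, which is why all hypotheses are kept together in one lattice instead of committing to a single correction per word.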