Using N-best lists for named entity recognition from Chinese speech

Authors:
Lufeng Zhai;Pascale Fung;Richard Schwartz;Marine Carpuat;Dekai Wu
Affiliations:
University of Science and Technology, Clear Water Bay, Hong Kong;University of Science and Technology, Clear Water Bay, Hong Kong;BBN Technologies, Columbia, MD;University of Science and Technology, Clear Water Bay, Hong Kong;University of Science and Technology, Clear Water Bay, Hong Kong
Venue:
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Year:
2004

Abstract

We present the first known result for named entity recognition (NER) in realistic large-vocabulary spoken Chinese. We establish this result by applying a maximum entropy model, currently the single best known approach for textual Chinese NER, to the recognition output of the BBN LVCSR system on Chinese Broadcast News utterances. Our results support the claim that transferring NER approaches from text to spoken language is a significantly more difficult task for Chinese than for English. We propose re-segmenting the ASR hypotheses and applying post-classification to improve performance. Finally, we introduce a method of using n-best hypotheses that yields a small but nevertheless useful improvement in NER accuracy. We use acoustic, phonetic, language model, NER, and other scores as confidence measures. Experimental results show an average relative improvement of 6.7% in precision and 1.7% in F-measure.
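
The abstract does not say how the individual confidence scores are combined. As an illustration only, the following minimal Python sketch assumes a weighted linear combination over the n-best list and picks the hypothesis with the highest combined score. The `Hypothesis` class, the score names, the weights, and the example values are all hypothetical and are not taken from the paper. A small helper also shows how a relative improvement, the kind of figure reported above, is computed from a baseline and an improved value.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """One entry of an ASR n-best list (hypothetical structure)."""
    text: str
    # Score name -> value, where higher means more confident (e.g. log scores).
    scores: dict = field(default_factory=dict)


def combined_score(hyp: Hypothesis, weights: dict) -> float:
    """Weighted sum of the confidence scores (an assumed combination rule)."""
    return sum(w * hyp.scores.get(name, 0.0) for name, w in weights.items())


def pick_best(nbest: list, weights: dict) -> Hypothesis:
    """Return the hypothesis with the highest combined score."""
    return max(nbest, key=lambda h: combined_score(h, weights))


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Illustrative values only: hypothesis A has the better acoustic score,
# but hypothesis B has a much better NER score.
nbest = [
    Hypothesis("hypothesis A", {"acoustic": -10.0, "lm": -4.0, "ner": -3.0}),
    Hypothesis("hypothesis B", {"acoustic": -11.0, "lm": -4.0, "ner": -1.0}),
]
weights = {"acoustic": 1.0, "lm": 0.5, "ner": 2.0}  # made-up weights

best = pick_best(nbest, weights)
print(best.text)
```

With these made-up weights, the NER score outweighs the small acoustic difference, so the selected hypothesis is not the acoustically best one. In practice the weights would need to be tuned on held-out data.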