Multilingual resources for entity extraction

Authors:
Stephanie Strassel; Alexis Mitchell; Shudong Huang
Affiliations:
Linguistic Data Consortium, Philadelphia, PA; Linguistic Data Consortium, Philadelphia, PA; Linguistic Data Consortium, Philadelphia, PA
Venue:
MultiNER '03 Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition - Volume 15
Year:
2003

Abstract

Progress in human language technology requires increasing amounts of data and annotation in a growing variety of languages. Research in Named Entity extraction is no exception. The Linguistic Data Consortium is creating annotated corpora to support information extraction in English, Chinese, Arabic, and other languages for a variety of US Government-sponsored programs. This paper covers the scope of annotation and research tasks within these programs, describes some of the challenges of multilingual corpus development for entity extraction, and concludes with a description of the corpora developed to support this research.