Improving the performance of a named entity recognition system with knowledge acquisition

Authors:
Myung Hee Kim;Paul Compton
Affiliations:
The University of New South Wales, Sydney, NSW, Australia;The University of New South Wales, Sydney, NSW, Australia
Venue:
EKAW'12 Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management
Year:
2012

Abstract

Named Entity Recognition (NER) is important for extracting information from highly heterogeneous web documents. Most NER systems have been developed on formal documents, but informal web documents usually contain noise as well as incorrect and incomplete expressions. The performance of current NER systems drops dramatically as the informality of web documents increases, so a different kind of NER is needed. Here we propose a Ripple-Down-Rules-based Named Entity Recognition (RDRNER) system. It is a wrapper around the machine-learning-based Stanford NER system that corrects its output using rules added by people to deal with specific application domains. The key advantages of this approach are that it can handle the freer writing style that occurs in web documents and correct errors introduced by the web's informal characteristics. In these studies, the Ripple-Down Rules approach, with low-cost rule addition, improved the Stanford NER system's performance on informal web documents in a specific domain to the same level as its state-of-the-art performance on formal documents.
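
The wrapper idea in the abstract can be illustrated with a minimal sketch of a generic single-classification ripple-down rule tree applied to the output of a base tagger. The abstract does not describe the actual rule language or tree layout of RDRNER, so everything here is an assumption made for illustration: the names `Rule`, `evaluate`, `correct` and `untagged_after`, the three example rules, and the hard-coded tagger output that stands in for a call to Stanford NER.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Token = Tuple[str, str]  # (word, tag assigned by the base tagger)
Condition = Callable[[List[Token], int], bool]


@dataclass
class Rule:
    """One node of a single-classification ripple-down rule tree (hypothetical)."""
    condition: Condition
    conclusion: str
    if_true: Optional["Rule"] = None   # exception, tried when this rule fires
    if_false: Optional["Rule"] = None  # alternative, tried when it does not


def evaluate(root: Optional[Rule], tokens: List[Token], i: int, default: str) -> str:
    """Return the conclusion of the last rule that fired on the path, else default."""
    result, node = default, root
    while node is not None:
        if node.condition(tokens, i):
            result = node.conclusion
            node = node.if_true
        else:
            node = node.if_false
    return result


def correct(tokens: List[Token], root: Optional[Rule]) -> List[Token]:
    """Re-tag every token, keeping the base tagger's tag when no rule fires."""
    return [(word, evaluate(root, tokens, i, tag))
            for i, (word, tag) in enumerate(tokens)]


def untagged_after(prev_word: str) -> Condition:
    """Condition: token has tag 'O' and the previous word equals prev_word."""
    def cond(tokens: List[Token], i: int) -> bool:
        return i > 0 and tokens[i][1] == "O" and tokens[i - 1][0].lower() == prev_word
    return cond


# Illustrative rules only; they are not taken from the paper.
not_a_place = Rule(lambda toks, i: toks[i][0].lower() in {"home", "night"}, "O")
after_in = Rule(untagged_after("in"), "LOCATION")
root = Rule(untagged_after("at"), "ORGANIZATION",
            if_true=not_a_place, if_false=after_in)

# Stand-in for base tagger output on lowercase, informal text (assumed, not real output).
base_output = [("i", "O"), ("study", "O"), ("at", "O"),
               ("unsw", "O"), ("in", "O"), ("sydney", "O")]
corrected = correct(base_output, root)
```

In this sketch a new correction is added as an exception (`if_true`) or an alternative (`if_false`) to an existing rule, so earlier rules are never edited; that is the property that makes rule addition cheap in ripple-down approaches generally.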