Using suffix arrays for efficiently recognition of named entities in large scale

Authors:
Benjamin Adrian;Sven Schwarz
Affiliations:
Knowledge Management Department, DFKI GmbH, Kaiserslautern, Germany;Knowledge Management Department, DFKI GmbH, Kaiserslautern, Germany
Venue:
KES'11 Proceedings of the 15th international conference on Knowledge-based and intelligent information and engineering systems - Volume Part II
Year:
2011

Abstract

In this paper, we present an efficient way to compare text with RDF data in order to recognize named entities. Here, a named entity is a text sequence that refers to a URI reference within an RDF graph. We use suffix arrays as the representation format for text and a relational database schema to represent Semantic Web data. With these representations, named entity recognition runs in linear time and does not require the names of existing entities to be held in memory. Both properties are needed to implement named entity recognition at the scale of, for instance, the DBpedia database.
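
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a word-level suffix array built by plain sorting (not a linear-time construction) and looks up each entity label by binary search, so it does not reach the linear bound claimed above. The function names, the tokenizer, and the (label, URI) pairs are hypothetical; the pairs stand in for rows streamed from a relational database, so the full set of names never has to sit in memory at once.

```python
import re
from typing import Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase word tokens are a good enough unit for matching.
    return re.findall(r"\w+", text.lower())


def build_suffix_array(tokens: List[str]) -> List[int]:
    # Naive construction: sort suffix start positions by the suffix itself.
    # Simple but not linear time; a real system would use a faster algorithm.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def find_occurrences(tokens: List[str], suffix_array: List[int],
                     phrase: List[str]) -> List[int]:
    # Binary search for the first suffix whose first len(phrase) tokens
    # are not smaller than the phrase, then collect the run of exact matches.
    n = len(phrase)
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if tokens[start:start + n] < phrase:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    while lo < len(suffix_array):
        start = suffix_array[lo]
        if tokens[start:start + n] != phrase:
            break
        hits.append(start)
        lo += 1
    return sorted(hits)


def recognize(text: str,
              labels: Iterable[Tuple[str, str]]) -> List[Tuple[int, str, str]]:
    # labels yields (label, uri) pairs one at a time, e.g. from a DB cursor.
    tokens = tokenize(text)
    suffix_array = build_suffix_array(tokens)
    found = []
    for label, uri in labels:
        phrase = tokenize(label)
        if not phrase:
            continue  # skip labels that contain no word tokens
        for position in find_occurrences(tokens, suffix_array, phrase):
            found.append((position, label, uri))
    return sorted(found)
```

Calling `recognize(text, rows)` with `rows` being any iterable of (label, URI) pairs returns the token positions at which each label occurs, paired with the URI it refers to. Only the text's suffix array is held in memory, while the labels are consumed one by one.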