This paper describes an approach to named entity recognition in structured data whose elements contain free text as values. We studied the recognition of three entity types (person, location and organization) in bibliographic data sets from a concrete, broad digital library initiative. Our approach is based on conditional random field models, using features designed to perform named entity recognition in the absence of strong lexical evidence and exploiting the semantic context given by the data structure. The evaluation results indicate that, with the specialized features, named entity recognition can be performed on free text within structured data with acceptable accuracy. Our approach achieved a maximum precision of 0.91 at 0.55 recall and a maximum recall of 0.82 at 0.77 precision. These results were consistently higher than those obtained with the Stanford Named Entity Recognizer, which was developed for grammatically well-formed text. We believe this level of quality allows the approach to support a wide range of information extraction applications in structured data.
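To illustrate what features that do not depend on strong lexical evidence, combined with the context given by the data structure, might look like, here is a minimal Python sketch. It is not the feature set used in the paper: the function names, the feature names, and the example field name "creator" are all assumptions made for illustration. Each token is described by its character shape, its position in the field value, and the name of the enclosing field, rather than by its surrounding words.

```python
def token_shape(token):
    """Collapse a token into a coarse shape, e.g. 'Lisbon' -> 'Xx', '1998' -> 'd'."""
    out = []
    for ch in token:
        if ch.isupper():
            sym = "X"
        elif ch.islower():
            sym = "x"
        elif ch.isdigit():
            sym = "d"
        else:
            sym = ch  # keep punctuation as-is
        # Collapse runs of the same symbol so long tokens share one shape.
        if not out or out[-1] != sym:
            out.append(sym)
    return "".join(out)


def token_features(tokens, i, field_name):
    """Build a hypothetical feature dict for token i of a field value."""
    last = len(tokens) - 1
    return {
        # Structural context: which element of the record holds the text.
        "field": field_name,
        # Orthographic evidence that needs no surrounding sentence.
        "shape": token_shape(tokens[i]),
        "is_capitalized": tokens[i][:1].isupper(),
        "position": "first" if i == 0 else ("last" if i == last else "middle"),
        "prev_shape": token_shape(tokens[i - 1]) if i > 0 else "<START>",
        "next_shape": token_shape(tokens[i + 1]) if i < last else "<END>",
    }


def field_to_features(field_name, value):
    """Turn one field value into a sequence of feature dicts, one per token."""
    tokens = value.split()
    return [token_features(tokens, i, field_name) for i in range(len(tokens))]


# Example with an illustrative (assumed) field name and value.
for feats in field_to_features("creator", "Silva, J. M."):
    print(feats)
```

A sequence of feature dictionaries like this, paired with one label per token, is the kind of input a conditional random field toolkit typically expects; which toolkit and which features the authors actually used is not stated in the abstract.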