Ranking algorithms for named-entity extraction: boosting and the voted perceptron
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Mining knowledge from text using information extraction
ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Named entity normalization in user generated content
Proceedings of the second workshop on Analytics for noisy unstructured text data
Empirical study on the performance stability of named entity recognition model across domains
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Chinese named entity recognition based on multilevel linguistic features
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Hi-index | 0.00 |
We used the memory-based learner Timbl (Daelemans et al., 2002) to find names in English and German newspaper text. A first system used only the training data, and a number of gazetteers. The results show that gazetteers are not beneficial in the English case, while they are for the German data. Type-token generalization was applied, but also reduced performance. The second system used gazetteers derived from the unannotated corpus, as well as the ratio of capitalized versus uncapitalized use of each word. These strategies gave an increase in performance.