In this paper we introduce a multilingual Named Entity Recognition (NER) system that uses statistical modeling techniques. The system identifies and classifies NEs in Hungarian and English by applying AdaBoostM1 and the C4.5 decision tree learning algorithm. We focused on building as large a feature set as possible and used a split-and-recombine technique to fully exploit its potential. This methodology made it possible to train several independent decision tree classifiers on different subsets of the features and to combine their decisions in a majority voting scheme. The corpus created for the CoNLL 2003 conference and a segment of the Szeged Corpus were used for training and validation. Both consist entirely of newswire articles. Our system remains portable across languages without requiring any major modification. It slightly outperforms the best system of CoNLL 2003 and achieves a 94.77% F-measure for Hungarian. The real value of our approach lies in its different basis compared with other top-performing models for English, which makes our system highly effective when used in combination with CoNLL models.
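The split-and-recombine idea can be illustrated with a minimal sketch: the feature set is divided into subsets, one classifier is trained per subset, and the per-token decisions are combined by majority vote. The sketch below is not the system described here. As an assumption made for brevity, it replaces the boosted C4.5 trees with a hypothetical lookup-table learner, and the three binary features (capitalized, in gazetteer, preceded by a title word) and the toy data are invented for illustration.

```python
from collections import Counter, defaultdict

def train_lookup(rows, labels, feature_idx):
    """Stand-in base learner (NOT AdaBoostM1 + C4.5): memorizes the most
    frequent label for each combination of the selected feature values."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in feature_idx)
        counts[key][label] += 1
    # Back off to the overall most frequent label for unseen combinations.
    default = Counter(labels).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}

    def predict(row):
        return table.get(tuple(row[i] for i in feature_idx), default)
    return predict

def split_and_recombine(rows, labels, feature_subsets):
    """Train one classifier per feature subset; combine by majority vote."""
    models = [train_lookup(rows, labels, subset) for subset in feature_subsets]

    def predict(row):
        votes = Counter(model(row) for model in models)
        return votes.most_common(1)[0][0]
    return predict

# Hypothetical toy data: (is_capitalized, in_gazetteer, prev_is_title)
rows = [
    (1, 1, 0), (1, 0, 1), (1, 0, 0), (0, 0, 0),
    (0, 1, 0), (1, 1, 1), (0, 0, 0),
]
labels = ["NE", "NE", "O", "O", "O", "NE", "O"]

# Three single-feature subsets give an odd number of voters, so a
# two-class vote cannot tie.
predict = split_and_recombine(rows, labels, [[0], [1], [2]])
```

With this toy data, `predict((1, 1, 0))` yields `"NE"` because two of the three voters agree, while `predict((1, 0, 0))` yields `"O"`. In a real setting each voter would be a much stronger learner trained on a larger, overlapping or disjoint group of features; the voting step stays the same.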