We address named entity recognition for Indian languages through a comparative study of two sequential learning approaches: Conditional Random Fields (CRF) and Support Vector Machines (SVM). Although we report results only for Hindi, the features used are language independent, so the same procedure could be applied to tag named entities in other Indian languages such as Telugu, Bengali, and Marathi, which have the same number of vowels and consonants. We used CRF++ to implement the CRF model and YamCha to implement the SVM model. The results show that CRF outperforms SVM and that its scores are only slightly below the best results reported for this task. The remaining gap can be attributed to the absence of any pre-processing or post-processing steps. The system uses the contextual information of words, together with various language-independent features, to label the Named Entities (NEs).
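As a rough illustration of what language-independent features might look like in practice, the sketch below turns tagged sentences into the column layout that both CRF++ and YamCha read: one token per line, whitespace-separated feature columns, the label in the last column, and a blank line between sentences. The specific features (two-character prefix and suffix, a digit flag, a capped token length), the function names, and the tag names are assumptions chosen for illustration, not the feature set reported in the paper. Contextual information, meaning the neighbouring words, would normally be supplied by the tool's own feature template or window settings rather than by this script.

```python
def token_features(token, affix_len=2):
    """Return surface features for one token.

    Hypothetical feature set: nothing here depends on a particular
    language, script, or dictionary.
    """
    prefix = token[:affix_len]            # first characters of the token
    suffix = token[-affix_len:]           # last characters of the token
    has_digit = "Y" if any(ch.isdigit() for ch in token) else "N"
    length = str(min(len(token), 9))      # cap length to keep the column small
    return [token, prefix, suffix, has_digit, length]


def to_column_format(sentences):
    """Serialize sentences of (token, tag) pairs into tab-separated columns.

    Each sentence is followed by a blank line, and the tag is written
    as the last column of every token line.
    """
    lines = []
    for sentence in sentences:
        for token, tag in sentence:
            lines.append("\t".join(token_features(token) + [tag]))
        lines.append("")                  # blank line closes the sentence
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Transliterated toy sentence with hypothetical tags.
    sample = [[("Ram", "B-PER"), ("Dilli", "B-LOC"), ("gaya", "O")]]
    print(to_column_format(sample), end="")
```

The same file could then be passed to either tool for training, which keeps the comparison between the two learners restricted to the algorithm rather than the input features.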