Named entity recognition is a fundamental task in biomedical text mining. Biomedical named entities include mentions of proteins, genes, DNA, RNA, etc., which often have complex structures, making them challenging to identify and classify. Machine learning methods such as CRF, MEMM and SVM have been widely used to learn to recognize such entities from an annotated corpus. The success of these methods depends heavily on identifying appropriate feature templates and selecting the important feature values. In this paper, we study word clustering and selection-based feature reduction approaches for named entity recognition using a maximum entropy classifier. Features are identified and selected largely automatically, without using domain knowledge. The system is found to outperform existing systems that do not use domain knowledge.
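
As a rough illustration of what selection-based feature reduction can look like, the sketch below ranks word features by information gain with respect to the entity labels and keeps the top k. The abstract does not name the selection criterion, so information gain is an assumption here, and the function names and toy data are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(samples, word):
    """Reduction in label entropy from knowing whether a token equals `word`.

    `samples` is a list of (token, label) pairs (assumed toy format).
    """
    labels = Counter(label for _, label in samples)
    with_word = Counter(label for token, label in samples if token == word)
    without_word = labels - with_word  # label counts for all other tokens

    n = len(samples)
    n_with = sum(with_word.values())
    n_without = n - n_with

    conditional = 0.0
    if n_with:
        conditional += (n_with / n) * entropy(with_word.values())
    if n_without:
        conditional += (n_without / n) * entropy(without_word.values())
    return entropy(labels.values()) - conditional


def select_words(samples, k):
    """Keep the k word features with the highest information gain."""
    vocabulary = {token for token, _ in samples}
    ranked = sorted(vocabulary, key=lambda w: (-information_gain(samples, w), w))
    return ranked[:k]


# Hypothetical toy corpus: tokens labelled PROTEIN or O (outside any entity).
samples = [
    ("kinase", "PROTEIN"), ("kinase", "PROTEIN"),
    ("the", "O"), ("the", "O"), ("the", "PROTEIN"),
    ("of", "O"), ("of", "O"),
    ("receptor", "PROTEIN"),
]
selected = select_words(samples, 2)
```

Only the selected words would then be passed to the classifier as word features; in this toy corpus, words that always co-occur with a single label rank above the ambiguous "the".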