In this paper, we propose a novel kernel function for support vector machines (SVMs) that can be used for sequential labeling tasks such as named entity recognition (NER). Machine learning methods such as support vector machines, maximum entropy models, hidden Markov models, and conditional random fields are the most widely used approaches for implementing NER systems, and the features they use are mostly string-based. The proposed kernel is built on a novel distance function between string-based features. In tasks like NER, both the similarity between contexts and the semantic similarity between words play an important role, and the distance function is designed to capture this context and semantic information. It uses statistics derived primarily from the training data, together with hierarchical clustering information. We apply the kernel to Hindi and biomedical NER tasks, and the results are promising.
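To make the idea of a kernel built from a distance between string-based features more concrete, the following is a minimal sketch and not the distance function proposed in the paper, which the abstract does not specify. It assumes, purely for illustration, that the distance between two feature values is the L1 difference between their label distributions estimated from training data, and that the kernel is an exponential of the summed per-feature distances. The names `class_distributions`, `value_distance`, `distance_kernel`, and the parameter `gamma` are hypothetical, and the hierarchical clustering information mentioned above is not modeled here.

```python
import math
from collections import Counter, defaultdict


def class_distributions(samples):
    """Estimate P(label | value) for each string feature value.

    `samples` is a list of (value, label) pairs taken from training data.
    """
    counts = defaultdict(Counter)
    for value, label in samples:
        counts[value][label] += 1
    return {
        value: {label: n / sum(c.values()) for label, n in c.items()}
        for value, c in counts.items()
    }


def value_distance(a, b, dists, labels):
    """Illustrative distance between two string feature values.

    Assumption (not from the paper): values are close when they tend to
    occur with the same labels. Identical strings have distance 0; a value
    never seen in training is treated as maximally distant (2.0, the
    largest possible L1 gap between two probability distributions).
    """
    if a == b:
        return 0.0
    if a not in dists or b not in dists:
        return 2.0
    return sum(abs(dists[a].get(l, 0.0) - dists[b].get(l, 0.0)) for l in labels)


def distance_kernel(x, y, dists, labels, gamma=0.5):
    """Exponential kernel over the summed per-feature distances.

    `x` and `y` are equal-length tuples of string features. This sketch
    does not verify that the resulting kernel is positive semidefinite.
    """
    total = sum(value_distance(a, b, dists, labels) for a, b in zip(x, y))
    return math.exp(-gamma * total)


# Toy training statistics (made-up data, for illustration only).
training = [
    ("Delhi", "LOC"), ("Delhi", "LOC"),
    ("Mumbai", "LOC"), ("Mumbai", "PER"),
    ("Ram", "PER"), ("Ram", "PER"),
]
labels = ["LOC", "PER"]
dists = class_distributions(training)

k_same = distance_kernel(("Delhi",), ("Delhi",), dists, labels)
k_near = distance_kernel(("Delhi",), ("Mumbai",), dists, labels)
k_far = distance_kernel(("Delhi",), ("Ram",), dists, labels)
print(k_same, k_near, k_far)
```

In this toy setup, two words that co-occur with similar labels receive a higher kernel value than two words with disjoint label distributions, which is the kind of behavior a statistics-driven distance is meant to provide. A function like this could be passed to an SVM implementation that accepts a precomputed kernel matrix.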