Full Length Article: Simulated annealing based classifier ensemble techniques: Application to part of speech tagging

Authors:
Asif Ekbal;Sriparna Saha
Affiliations:
Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India;Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
Venue:
Information Fusion
Year:
2013

Citing 29
Cited 0

Back propagation is sensitive to initial conditions

NIPS-3 Proceedings of the 1990 conference on Advances in neural information processing systems 3
Original Contribution: Stacked generalization

Neural Networks
The nature of statistical learning theory

The nature of statistical learning theory
Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging

Computational Linguistics
Support-Vector Networks

Machine Learning
Bagging predictors

Machine Learning
Feature selection in SVM text categorization

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Multi-Objective Optimization Using Evolutionary Algorithms

Multi-Objective Optimization Using Evolutionary Algorithms
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A decision-theoretic generalization of on-line learning and an application to boosting

EuroCOLT '95 Proceedings of the Second European Conference on Computational Learning Theory
Ensemble Methods in Machine Learning

MCS '00 Proceedings of the First International Workshop on Multiple Classifier Systems
Tagging English text with a probabilistic model

Computational Linguistics
TnT: a statistical part-of-speech tagger

ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Chunking with support vector machines

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Shallow parsing with conditional random fields

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Feature-rich part-of-speech tagging with a cyclic dependency network

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Enriching the knowledge sources used in a maximum entropy part-of-speech tagger

EMNLP '00 Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
Named entity recognition through classifier combination

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Morphological richness offsets resource demand- experiences in constructing a POS tagger for Hindi

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Solving multiclass learning problems via error-correcting output codes

Journal of Artificial Intelligence Research
Voted NER system using appropriate unlabeled data

NEWS '09 Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration
Simple semi-supervised training of part-of-speech taggers

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Efficient graph-based semi-supervised learning of structured tagging models

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Part-of-speech tagging from 97% to 100%: is it time for some linguistics?

CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Unsupervised part-of-speech tagging with bilingual graph-based projections

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
A fast and elitist multiobjective genetic algorithm: NSGA-II

IEEE Transactions on Evolutionary Computation
A Simulated Annealing-Based Multiobjective Optimization Algorithm: AMOSA

IEEE Transactions on Evolutionary Computation
Dominance-Based Multiobjective Simulated Annealing

IEEE Transactions on Evolutionary Computation
Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images

IEEE Transactions on Pattern Analysis and Machine Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Part-of-Speech (PoS) tagging is an important pipelined module for almost all Natural Language Processing (NLP) application areas. In this paper we formulate PoS tagging within the frameworks of single and multi-objective optimization techniques. At the very first step we propose a classifier ensemble technique for PoS tagging using the concept of single objective optimization (SOO) that exploits the search capability of simulated annealing (SA). Thereafter we devise a method based on multiobjective optimization (MOO) to solve the same problem, and for this a recently developed multiobjective simulated annealing based technique, AMOSA, is used. The characteristic features of AMOSA are its concepts of the amount of domination and archive in simulated annealing, and situation specific acceptance probabilities. We use Conditional Random Field (CRF) and Support Vector Machine (SVM) as the underlying classification methods that make use of a diverse set of features, mostly based on local contexts and orthographic constructs. We evaluate our proposed approaches for two Indian languages, namely Bengali and Hindi. Evaluation results of the single objective version shows the overall accuracy of 88.92% for Bengali and 87.67% for Hindi. The MOO based ensemble yields the overall accuracies of 90.45% and 89.88% for Bengali and Hindi, respectively.