Boosting-based ensemble learning can improve classification accuracy by combining multiple classification models, each constructed to cope with the errors made in the preceding steps. This paper proposes a method to improve boosting-based ensemble learning with penalty profiles, applied to automatic unknown word recognition in the Thai language. Because the approach treats a sequential problem as a non-sequential one, unknown word recognition must include a process that ranks the set of candidates generated for each potential unknown word position. To strengthen recognition with ensemble classification, penalty profiles are defined so that each succeeding classification model can be constructed more efficiently and tends to re-rank the ranked candidates into a suitable order. For evaluation, a number of alternative penalty profiles are introduced and their performance is compared on the task of extracting unknown words from a large Thai medical text. Using Naive Bayes as the base classifier for ensemble learning, the proposed method with the best setting achieves an accuracy of 90.19%, a gain of 12.88, 10.59, and 6.05 over conventional Naive Bayes, the non-ensemble version, and the flat-penalty profile, respectively.
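The abstract does not define the penalty profiles, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each potential unknown word position has a list of scored candidates, that a penalty profile maps the rank of the correct candidate to a non-negative penalty, and that the weight of a position is multiplied by one plus that penalty before normalisation. The names `flat_penalty`, `linear_penalty`, `rank_of_correct`, and `reweight`, as well as the update rule itself, are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# A penalty profile maps the 1-based rank of the correct candidate
# to a non-negative penalty (assumed form, not the paper's definition).
PenaltyProfile = Callable[[int], float]


def flat_penalty(rank: int) -> float:
    """Assumed flat profile: every misranked position gets the same penalty."""
    return 0.0 if rank == 1 else 1.0


def linear_penalty(rank: int) -> float:
    """Assumed alternative: penalty grows with the distance from the top rank."""
    return float(rank - 1)


def rank_of_correct(scores: Sequence[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct candidate by descending score."""
    target = scores[correct_index]
    return 1 + sum(1 for s in scores if s > target)


def reweight(weights: Sequence[float], ranks: Sequence[int],
             profile: PenaltyProfile) -> List[float]:
    """Boosting-style update (hypothetical rule): scale each position's
    weight by (1 + penalty) and renormalise so the weights sum to 1."""
    raw = [w * (1.0 + profile(r)) for w, r in zip(weights, ranks)]
    total = sum(raw)
    return [x / total for x in raw]


# Toy data: (candidate scores, index of the correct candidate) per position.
positions = [
    ([0.9, 0.1, 0.3], 0),  # correct candidate already ranked first
    ([0.2, 0.7, 0.5], 2),  # correct candidate ranked second
    ([0.8, 0.6, 0.1], 2),  # correct candidate ranked third
]
ranks = [rank_of_correct(scores, idx) for scores, idx in positions]
uniform = [1.0 / len(positions)] * len(positions)

flat_weights = reweight(uniform, ranks, flat_penalty)
linear_weights = reweight(uniform, ranks, linear_penalty)
```

In this sketch the flat profile treats the second and third positions identically, whereas the linear profile gives the third position the largest weight, so a succeeding model trained on the reweighted data would focus more on badly misranked candidates. Whether the paper's profiles behave this way is not stated in the abstract.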