Improving Term Extraction by System Combination Using Boosting

Authors:
Jordi Vivaldi;Lluís Màrquez;Horacio Rodríguez
Affiliations:
-;-;-
Venue:
EMCL '01 Proceedings of the 12th European Conference on Machine Learning
Year:
2001

Citing 8
Cited 2

Original Contribution: Stacked generalization

Neural Networks
On Combining Classifiers

IEEE Transactions on Pattern Analysis and Machine Intelligence
EuroWordNet: a multilingual database with lexical semantic networks

EuroWordNet: a multilingual database with lexical semantic networks
Improved Boosting Algorithms Using Confidence-rated Predictions

Machine Learning - The Eleventh Annual Conference on computational Learning Theory
BoosTexter: A Boosting-based Systemfor Text Categorization

Machine Learning - Special issue on information retrieval
Boosting Applied toe Word Sense Disambiguation

ECML '00 Proceedings of the 11th European Conference on Machine Learning
A methodology for automatic term recognition

COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
Boosting trees for clause splitting

ConLL '01 Proceedings of the 2001 workshop on Computational Natural Language Learning - Volume 7

The REG summarization system with question reformulation at QA@INEX track 2010

INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval
An unsupervised cascade learning scheme for 'cluster-theme keywords' structure extraction from scientific papers

Journal of Information Science

Quantified Score

Hi-index	0.00

Visualization

Abstract

Term extraction is the task of automatically detecting, from textual corpora, lexical units that designate concepts in thematically restricted domains (e.g. medicine). Current systems for term extraction integrate linguistic and statistical cues to perform the detection of terms. The best results have been obtained when some kind of combination of simple base term extractors is performed [14]. In this paper it is shown that this combination can be further improved by posing an additional learning problem of how to find the best combination of base term extractors. Empirical results, using AdaBoost in the metalearning step, show that the ensemble constructed surpasses the performance of all individual extractors and simple voting schemes, obtaining significantly better accuracy figures at all levels of recall.