The classification problems derived from information extraction (IE) typically have imbalanced training sets. This is particularly true when learning from small datasets, which often contain only a few positive training examples but many negative ones. This paper takes two popular IE learning algorithms -- SVM and Perceptron -- and demonstrates how introducing an uneven margins parameter improves their results on imbalanced IE training data. Our experiments show that the uneven margin is indeed helpful, especially when learning from few examples: essentially, the smaller the training set, the more beneficial the uneven margin becomes. We also compare our systems against other state-of-the-art algorithms on several benchmark IE corpora.
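The uneven margins idea is easiest to see in the perceptron variant (PAUM, following Li et al.'s "The Perceptron Algorithm with Uneven Margins"): instead of updating only on misclassifications, the learner updates whenever an example's functional margin falls below a class-specific target, with the positive-class target set higher than the negative one. Below is a minimal sketch of that update rule; the function names, default parameter values, and use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def paum_train(X, y, tau_pos=1.0, tau_neg=0.1, eta=1.0, epochs=10):
    """PAUM-style perceptron sketch (illustrative, not the authors' code).

    X : (n_samples, n_features) array; y : labels in {-1, +1}.
    tau_pos / tau_neg are the margins demanded of positive and negative
    examples; tau_pos > tau_neg pushes the decision boundary away from
    the minority positive class.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    R2 = np.max(np.sum(X ** 2, axis=1))  # squared radius of the data

    for _ in range(epochs):
        for xi, yi in zip(X, y):
            tau = tau_pos if yi > 0 else tau_neg
            # Update whenever the functional margin falls short of the
            # class-specific target, not only on misclassification.
            if yi * (w @ xi + b) <= tau:
                w += eta * yi * xi
                b += eta * yi * R2
    return w, b

def paum_predict(X, w, b):
    return np.sign(X @ w + b)
```

With tau_pos = tau_neg = 0 this reduces to the standard perceptron; setting tau_pos well above tau_neg makes the few positive examples exert more pull on the hyperplane, which is the effect the paper exploits when training data are scarce and imbalanced.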