The classification problems derived from information extraction (IE) typically have imbalanced training sets. This is particularly true when learning from small datasets, which often contain only a few positive training examples but many negative ones. This paper takes two popular IE learning algorithms -- SVM and Perceptron -- and demonstrates how introducing an uneven margins parameter improves their results on imbalanced IE training data. Our experiments show that the uneven margin is indeed helpful, especially when learning from few examples: essentially, the smaller the training set, the more beneficial the uneven margin becomes. We also compare our systems against other state-of-the-art algorithms on several benchmark IE corpora.
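The uneven margins idea is easiest to see in the perceptron variant (PAUM, following Li et al.'s "The Perceptron Algorithm with Uneven Margins"): instead of updating only on misclassifications, the learner updates whenever an example's functional margin falls below a class-specific target, with the positive-class target set higher than the negative one. Below is a minimal sketch of that update rule; the function names, default parameter values, and use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def paum_train(X, y, tau_pos=1.0, tau_neg=0.1, eta=1.0, epochs=10):
    """PAUM-style perceptron sketch (illustrative, not the authors' code).

    X : (n_samples, n_features) array; y : labels in {-1, +1}.
    tau_pos / tau_neg are the margins demanded of positive and negative
    examples; tau_pos > tau_neg pushes the decision boundary away from
    the minority positive class.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    R2 = np.max(np.sum(X ** 2, axis=1))  # squared radius of the data

    for _ in range(epochs):
        for xi, yi in zip(X, y):
            tau = tau_pos if yi > 0 else tau_neg
            # Update whenever the functional margin falls short of the
            # class-specific target, not only on misclassification.
            if yi * (w @ xi + b) <= tau:
                w += eta * yi * xi
                b += eta * yi * R2
    return w, b

def paum_predict(X, w, b):
    return np.sign(X @ w + b)
```

With tau_pos = tau_neg = 0 this reduces to the standard perceptron; setting tau_pos well above tau_neg makes the few positive examples exert more pull on the hyperplane, which is the effect the paper exploits when training data are scarce and imbalanced.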