Text chunking based on a generalization of winnow

Authors:
Tong Zhang;Fred Damerau;David Johnson
Affiliations:
T.J. Watson Research Center, Route 134, Yorktown Heights, NY;T.J. Watson Research Center, Route 134, Yorktown Heights, NY;T.J. Watson Research Center, Route 134, Yorktown Heights, NY
Venue:
The Journal of Machine Learning Research
Year:
2002

Abstract

This paper describes a text chunking system based on a generalization of the Winnow algorithm. We propose a general statistical model for text chunking, which we then convert into a classification problem. We argue that the Winnow family of algorithms is particularly suitable for solving classification problems arising from NLP applications because of its robustness to irrelevant features. However, in theory, Winnow may not converge on linearly non-separable data. To remedy this problem, we employ a generalization of the original Winnow method. An additional advantage of the new algorithm is that it provides reliable confidence estimates for its classification predictions, a property that our statistical modeling approach requires. We show that our system achieves state-of-the-art performance in text chunking at lower computational cost than previous systems.
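
For orientation, the following is a minimal sketch of the original Winnow update (the multiplicative, mistake-driven scheme from "Learning Quickly When Irrelevant Attributes Abound"), not the generalized algorithm the abstract refers to, whose details are not given here. The function names, the toy target (an OR of two out of eight binary features), and the choices alpha = 2 and threshold = number of features are assumptions made for illustration only.

```python
from itertools import product


def winnow_predict(weights, threshold, x):
    """Predict 1 if the summed weight of the active features reaches the threshold."""
    score = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if score >= threshold else 0


def winnow_train(examples, n_features, alpha=2.0, max_epochs=100):
    """Classic Winnow on binary features (illustrative sketch, not the paper's variant).

    Weights start at 1. On a mistake, the weights of the active features are
    multiplied by alpha (missed positive) or divided by alpha (false positive).
    """
    weights = [1.0] * n_features
    threshold = float(n_features)  # assumed choice of threshold
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if winnow_predict(weights, threshold, x) != y:
                mistakes += 1
                factor = alpha if y == 1 else 1.0 / alpha
                weights = [w * factor if xi else w for w, xi in zip(weights, x)]
        if mistakes == 0:  # a full pass with no mistakes: stop
            break
    return weights, threshold


# Toy data: the label depends only on features 0 and 1; the other six are irrelevant.
n_features = 8
examples = [(x, 1 if (x[0] or x[1]) else 0) for x in product([0, 1], repeat=n_features)]

weights, threshold = winnow_train(examples, n_features)
accuracy = sum(winnow_predict(weights, threshold, x) == y for x, y in examples) / len(examples)
print(weights, accuracy)
```

On this separable toy problem the updates stop after a bounded number of mistakes, and the irrelevant features end up with small weights. The non-convergence concern raised in the abstract applies when no weight vector separates the data, which is the case this sketch does not address.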