A machine learning parser using an unlexicalized distituent model

Authors:
Samuel W. K. Chan;Lawrence Y. L. Cheung;Mickey W. C. Chong
Affiliations:
Dept. of Decision Sciences, Chinese University of Hong Kong, Shatin, Hong Kong SAR;Dept. of Decision Sciences, Chinese University of Hong Kong, Shatin, Hong Kong SAR;Dept. of Decision Sciences, Chinese University of Hong Kong, Shatin, Hong Kong SAR
Venue:
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2010

Abstract

Despite the popularity of lexicalized parsing models, practical concerns such as data sparseness and applicability to domains with different vocabularies mean that unlexicalized models, which do not refer to word tokens themselves, deserve more attention. We have developed a classifier-based parser that uses an unlexicalized parsing model. A machine learning method integrates linguistic attributes and information-theoretic attributes in two tasks, namely sentence chunking and phrase recognition. Most importantly, to enhance the accuracy of these tasks, we investigated the notion of distituency (the possibility that two parts of speech cannot remain in the same constituent or phrase) and incorporated it as attributes using various statistical measures. The parser was applied to English and Chinese sentences in the Penn Treebank and the Tsinghua Chinese Treebank. It achieved an F-score of 80.3% for English and 82.4% for Chinese.
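
The abstract does not say which statistical measures were used to encode distituency. As a rough illustration only, the sketch below assumes pointwise mutual information (PMI) between adjacent part-of-speech tags as one possible measure: a low or undefined PMI suggests that two tags rarely sit side by side and may therefore belong to different constituents. The function names, the threshold, and the toy corpus are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def adjacent_tag_pmi(tagged_sentences):
    """Return PMI (base 2) for every adjacent POS-tag pair seen in the corpus.

    Assumption: PMI stands in for the paper's unspecified statistical measures.
    """
    left = Counter()    # counts of a tag appearing as the left member of a pair
    right = Counter()   # counts of a tag appearing as the right member of a pair
    pairs = Counter()   # counts of each adjacent (left, right) tag pair
    for tags in tagged_sentences:
        for a, b in zip(tags, tags[1:]):
            pairs[(a, b)] += 1
            left[a] += 1
            right[b] += 1
    total = sum(pairs.values())
    return {
        (a, b): math.log2(n * total / (left[a] * right[b]))
        for (a, b), n in pairs.items()
    }


def distituent_candidates(tags, pmi, threshold=0.0):
    """Return indices i where tags[i] and tags[i + 1] look like a boundary.

    Unseen pairs are treated as maximally distituent (PMI of minus infinity).
    The threshold is an illustrative choice, not a value from the paper.
    """
    return [
        i
        for i, pair in enumerate(zip(tags, tags[1:]))
        if pmi.get(pair, float("-inf")) < threshold
    ]


# Hypothetical toy corpus of POS-tag sequences (no word tokens, i.e. unlexicalized).
corpus = [
    ["DT", "NN", "VBD", "DT", "NN"],
    ["DT", "JJ", "NN", "VBD"],
    ["NN", "VBD", "DT", "NN"],
]
pmi = adjacent_tag_pmi(corpus)
boundaries = distituent_candidates(["NN", "DT", "NN"], pmi)
```

In a classifier-based setting, scores like these would be supplied as attributes alongside linguistic ones rather than applied as hard thresholds; the threshold function here only makes the idea concrete.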