This paper describes novel and practical Japanese parsers that use decision trees. First, we construct a single decision tree to estimate modification probabilities, that is, how likely one phrase is to modify another. Next, we introduce a boosting algorithm in which several decision trees are constructed and then combined for probability estimation. The two parsers are evaluated on the EDR Japanese annotated corpus. The single-tree method outperforms conventional Japanese stochastic methods by 4%. Moreover, the boosting version shows two significant advantages: (1) better parsing accuracy than its single-tree counterpart for any amount of training data, and (2) no overfitting to the data across boosting iterations.
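The boosting idea — train several trees on reweighted data, then combine their weighted votes into a modification probability — can be sketched in a minimal, self-contained form. This is not the paper's implementation: the paper boosts full decision trees over rich linguistic features, while this toy uses invented binary features for (modifier, head) pairs and AdaBoost over depth-1 stumps, purely to illustrate the mechanics.

```python
import math

# Toy data: each row is a binary feature vector for a candidate
# (modifier phrase, head phrase) pair; the label is 1 if the modifier
# actually depends on that head. Features and data are hypothetical
# stand-ins (e.g. "head contains a verb", "phrases are adjacent").
X = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (1, 0), (0, 1), (0, 0)]
y = [1, 1, 1, 0, 1, 0, 1, 0]

def train_stump(X, y, w):
    """Depth-1 decision tree: choose the binary feature (and polarity)
    with the lowest weighted training error."""
    best = None
    for f in range(len(X[0])):
        for pred_if_one in (0, 1):
            err = sum(wi for xi, yi, wi in zip(X, y, w)
                      if (pred_if_one if xi[f] else 1 - pred_if_one) != yi)
            if best is None or err < best[0]:
                best = (err, f, pred_if_one)
    return best

def stump_predict(stump, xi):
    _, f, pred_if_one = stump
    return pred_if_one if xi[f] else 1 - pred_if_one

def boost(X, y, rounds=5):
    """AdaBoost: each round reweights the examples toward those the
    previous learners got wrong, then adds a new weighted stump."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)
        if err >= 0.5:          # no weak learner better than chance
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Upweight misclassified examples, downweight correct ones.
        w = [wi * math.exp(alpha if stump_predict(stump, xi) != yi else -alpha)
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def modify_prob(ensemble, xi):
    """Turn the weighted votes into a modification probability via a
    logistic transform of the ensemble margin."""
    score = sum(a * (1 if stump_predict(s, xi) else -1) for a, s in ensemble)
    return 1.0 / (1.0 + math.exp(-2.0 * score))

ensemble = boost(X, y)
```

A parser would then score each candidate head for a phrase with `modify_prob` and search for the dependency structure maximizing the product of the per-pair probabilities; the single-tree variant in the paper is the special case of one (deeper) tree with no reweighting.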