The longer the input sentence, the worse the syntactic parsing results. A long sentence is therefore first divided into several clauses, each clause is analyzed syntactically, and the per-clause analyses are finally merged into one. In the merging process, it is difficult to determine the dependencies among clauses. To handle this syntactic ambiguity among clauses, this paper proposes a two-step clause-dependency determination method based on machine learning techniques. We extract various clause-specific features and analyze the effect of each feature on performance. For Korean texts, we experiment with four kinds of machine learning methods. The LogitBoost method performed best, and it also outperformed previous rule-based methods.
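
The divide, analyze, and merge flow described above can be illustrated with a minimal Python sketch. It is not the paper's method: the function names, the boundary-marker rule for splitting, the assumption that each clause attaches to some later clause, and the toy nearest-clause scorer are all hypothetical stand-ins. In a learned system, the scorer would be replaced by a classifier trained on clause-specific features.

```python
from typing import Callable, List, Set, Tuple

Clause = List[str]


def split_into_clauses(tokens: List[str], boundary_markers: Set[str]) -> List[Clause]:
    """Cut a token list after each boundary marker (hypothetical splitting rule)."""
    clauses: List[Clause] = []
    current: Clause = []
    for tok in tokens:
        current.append(tok)
        if tok in boundary_markers:
            clauses.append(current)
            current = []
    if current:  # keep trailing tokens as the final clause
        clauses.append(current)
    return clauses


def attach_clauses(
    clauses: List[Clause],
    score: Callable[[int, int, List[Clause]], float],
) -> List[Tuple[int, int]]:
    """Link every non-final clause to its best-scoring later clause.

    Assumption: a clause depends on a clause to its right, so only later
    clauses are candidate heads. Returns (dependent_index, head_index) pairs.
    """
    links: List[Tuple[int, int]] = []
    for i in range(len(clauses) - 1):
        candidates = range(i + 1, len(clauses))
        head = max(candidates, key=lambda j: score(i, j, clauses))
        links.append((i, head))
    return links


def nearest_clause_score(i: int, j: int, clauses: List[Clause]) -> float:
    """Toy scorer that prefers the closest following clause (placeholder only)."""
    return -float(j - i)


tokens = ["a", ",", "b", ",", "c"]
clauses = split_into_clauses(tokens, {","})
links = attach_clauses(clauses, nearest_clause_score)
```

Passing the scorer in as a function keeps the merging step independent of the learning method, so different classifiers could be compared without changing the attachment loop.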