Machine-learning-based transformation of passive Japanese sentences into active by separating training data into each input particle

Authors:
Masaki Murata;Tamotsu Shirado;Toshiyuki Kanamaru;Hitoshi Isahara
Affiliations:
National Institute of Information and Communications Technology, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan;National Institute of Information and Communications Technology, Kyoto, Japan
Venue:
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Year:
2006

Abstract

We developed a new method for transforming Japanese case particles when converting Japanese passive sentences into active sentences. The method partitions the training data by input particle and trains a separate machine-learning model for each particle. We also used a large set of rich features for learning. Our method achieved high accuracy (94.30%). In contrast, a method that did not partition the training data by input particle achieved lower accuracy (92.00%). In addition, a method from a previous study (Murata and Isahara, 2003), which lacked many of these rich features, achieved much lower accuracy (89.77%). A statistical test confirmed that these improvements were significant. We also ran experiments with traditional methods based on verb dictionaries and manually prepared heuristic rules, and confirmed that our method achieved much higher accuracy than those methods.
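
The idea of separating the training data by input particle can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the romanized toy examples, the feature strings, the names `CountClassifier`, `train_per_particle`, and `transform_particle`, and the simple count-based learner, which stands in for whichever machine-learning method is actually used and is not taken from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (input particle in the passive sentence,
# features, particle to use in the active sentence).
TRAIN = [
    ("ni", ("verb=kamareta",), "ga"),
    ("ni", ("verb=homerareta",), "ga"),
    ("ni", ("verb=okurareta",), "ni"),
    ("ga", ("verb=kamareta",), "wo"),
    ("ga", ("verb=homerareta",), "wo"),
]


class CountClassifier:
    """Toy stand-in learner: votes by feature/label co-occurrence counts."""

    def __init__(self):
        self.label_counts = Counter()
        self.feat_counts = defaultdict(Counter)

    def fit(self, examples):
        for feats, label in examples:
            self.label_counts[label] += 1
            for f in feats:
                self.feat_counts[f][label] += 1
        return self

    def predict(self, feats):
        def score(label):
            # Feature votes first, overall label frequency as tie-breaker.
            votes = sum(self.feat_counts.get(f, {}).get(label, 0) for f in feats)
            return (votes, self.label_counts[label])

        return max(self.label_counts, key=score)


def train_per_particle(examples):
    """Split the training data by input particle and fit one model per split."""
    groups = defaultdict(list)
    for particle, feats, label in examples:
        groups[particle].append((feats, label))
    return {p: CountClassifier().fit(g) for p, g in groups.items()}


def transform_particle(models, particle, feats):
    """Pick the output particle using the model trained for this input particle."""
    return models[particle].predict(feats)


models = train_per_particle(TRAIN)
print(transform_particle(models, "ni", ("verb=okurareta",)))
print(transform_particle(models, "ga", ("verb=kamareta",)))
```

Because each model only ever sees examples for its own input particle, the label distribution it learns is specific to that particle. The non-separated baseline mentioned in the abstract would instead correspond to fitting a single model on all of `TRAIN`.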