Generating a set of rules to determine the gender of a speaker of a Japanese sentence

Authors:
Kanako Komiya; Chikara Igarashi; Kazutomo Shibahara; Koji Fujimoto; Yasuhiro Tajima; Yoshiyuki Kotani
Affiliations:
Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan (Komiya, Igarashi, Shibahara, Tajima, Kotani); Tensor Consulting Co. Ltd., Chiyoda-ku, Tokyo, Japan (Fujimoto)
Venue:
WSEAS TRANSACTIONS on COMMUNICATIONS
Year:
2009

Abstract

Some work has been reported on automatically determining the gender of a document's author, as part of research on extracting author characteristics. Japanese has so-called masculine and feminine expressions, which often indicate the gender of the speaker of a conversational sentence. A computer system needs to model these expressions in order to generate or understand natural Japanese conversational sentences. The authors built a system, named the gender-determining system (GDS), that determines the more suitable speaker gender for a single conversational sentence. GDS uses decision tree learning to automatically generate a set of rules for this decision. As features for decision tree learning, the authors employed six linguistic features for each of the two morphemes at the end of a sentence, together with the presence or absence of morphemes whose part of speech is a miscellaneous pronoun or a sentence-ending particle. Evaluated by cross-validation, the accuracy of GDS was approximately 69.3%, whereas humans answered the same problem with an accuracy of approximately 71.7%. The authors showed that decision tree learning is more suitable than multiple regression analysis or Bayesian estimation for classifying the gender of the speaker of Japanese sentences and for generating a set of rules to determine it, and they selected suitable features as the inputs of GDS. The set of rules generated by GDS indicates, for example, that women speak more politely than men in Japan.
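
The general idea of learning rules from sentence-final morphemes can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sentences, the part-of-speech tags, the feature names, and the helper functions extract_features, build_tree and classify are all hypothetical, only two features per morpheme (surface form and part of speech) stand in for the six linguistic features mentioned in the abstract, and a simple ID3-style information-gain tree is used in place of whatever decision tree learner the paper actually employed.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (list of (surface, part_of_speech) morphemes, label).
# Sentences, tags and labels are illustrative only, not the paper's data.
TOY_DATA = [
    ([("行く", "verb"), ("ぜ", "final_particle")], "M"),
    ([("やる", "verb"), ("ぞ", "final_particle")], "M"),
    ([("俺", "pronoun"), ("だ", "auxiliary")], "M"),
    ([("行く", "verb"), ("わ", "final_particle")], "F"),
    ([("素敵", "noun"), ("かしら", "final_particle")], "F"),
    ([("あたし", "pronoun"), ("よ", "final_particle")], "F"),
]

def extract_features(morphemes):
    """Features from the last two morphemes plus two presence flags (assumed set)."""
    last, prev = morphemes[-1], morphemes[-2]
    pos_tags = {pos for _, pos in morphemes}
    return {
        "last_surface": last[0], "last_pos": last[1],
        "prev_surface": prev[0], "prev_pos": prev[1],
        "has_pronoun": "pronoun" in pos_tags,
        "has_final_particle": "final_particle" in pos_tags,
    }

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """ID3-style tree: split on the feature with the highest information gain."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf node: a gender label

    def gain(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - remainder

    best = max(features, key=gain)
    rest = [f for f in features if f != best]
    node = {"feature": best, "default": majority, "children": {}}
    for value in {row[best] for row in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], rest)
    return node

def classify(tree, row):
    """Follow the learned rules; fall back to the node's majority label."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree

rows = [extract_features(m) for m, _ in TOY_DATA]
labels = [label for _, label in TOY_DATA]
tree = build_tree(rows, labels, list(rows[0].keys()))

unseen = [("帰る", "verb"), ("ぜ", "final_particle")]
predicted = classify(tree, extract_features(unseen))
```

On this toy corpus the tree splits on the surface form of the final morpheme, so each path from the root to a leaf reads as a rule of the form "if the sentence ends in X, the more suitable speaker is Y". An accuracy figure such as those reported in the abstract would additionally require held-out evaluation, for example cross-validation, which this sketch omits.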