Generating a set of rules to determine the gender of a speaker of a Japanese sentence

  • Authors:
  • Kanako Komiya;Chikara Igarashi;Kazutomo Shibahara;Koji Fujimoto;Yasuhiro Tajima;Yoshiyuki Kotani

  • Affiliations:
  • Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan;Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan;Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan;Tensor Consulting Co. Ltd., ChiyodaKu, Tokyo, Japan;Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan;Department of Computer, Information and Communication Sciences, Tokyo University of Agriculture and Technology, Koganei, Tokyo, Japan

  • Venue:
  • WSEAS TRANSACTIONS on COMMUNICATIONS
  • Year:
  • 2009

Abstract

Automatically determining the gender of a document's author has been studied as part of research on extracting author characteristics. Japanese has masculine and feminine expressions, which often indicate the gender of the speaker of a conversational sentence. A computer system needs to handle these expressions in order to generate or understand natural Japanese conversational sentences. The authors built a system, named the gender-determining system (GDS), that determines the more suitable gender for the speaker of a single conversational sentence. GDS uses decision tree learning to generate automatically a set of rules for this determination. The features used for decision tree learning were six linguistic features for each of the last two morphemes of a sentence, together with the presence or absence of morphemes whose part of speech is a miscellaneous pronoun or a sentence-ending particle. The accuracy of GDS, estimated by cross-validation, was approximately 69.3%, whereas humans answered the same problem with an accuracy of approximately 71.7%. The authors showed that decision tree learning is more suitable than multiple regression analysis or Bayesian estimation for classifying the gender of the speaker of a Japanese sentence and for generating a set of rules to determine it, and they selected suitable features as the inputs of GDS. The set of rules generated by GDS indicates, for example, that women speak more politely than men in Japan.
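The approach described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a sentence arrives as a list of (surface, part-of-speech) pairs from some morphological analyzer. The abstract does not name the six per-morpheme features, so only surface form and part of speech are used here. The feature names, the tag names `pronoun` and `ending_particle`, the helper functions, and the toy sentences are all hypothetical. The learner is a plain ID3-style decision tree, chosen only to show how rules fall out of the learned tree.

```python
from collections import Counter
from math import log2

def extract_features(morphemes):
    """Features from the last two morphemes plus two presence flags.

    Assumption: each morpheme is a (surface, pos) pair; the abstract's six
    per-morpheme features are reduced to these two for illustration.
    """
    padded = [("<none>", "<none>")] * 2 + list(morphemes)
    last, prev = padded[-1], padded[-2]
    pos_tags = {pos for _, pos in morphemes}
    return {
        "last_surface": last[0],
        "last_pos": last[1],
        "prev_surface": prev[0],
        "prev_pos": prev[1],
        "has_pronoun": "pronoun" in pos_tags,  # hypothetical tag name
        "has_ending_particle": "ending_particle" in pos_tags,  # hypothetical
    }

def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """ID3-style tree: a leaf is a label, a node is (feature, branches, default)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority

    def remainder(feature):
        # Weighted entropy left after splitting on this feature.
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)
    rest = [f for f in features if f != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], rest
        )
    return (best, branches, majority)

def classify(tree, row):
    """Follow branches until a leaf; unseen values fall back to the majority."""
    while isinstance(tree, tuple):
        feature, branches, default = tree
        tree = branches.get(row[feature], default)
    return tree

# Toy, romanized training data invented for this sketch (not from the paper).
toy = [
    ([("iku", "verb"), ("ze", "ending_particle")], "male"),
    ([("iku", "verb"), ("wa", "ending_particle")], "female"),
    ([("taberu", "verb"), ("ze", "ending_particle")], "male"),
    ([("taberu", "verb"), ("wa", "ending_particle")], "female"),
]
rows = [extract_features(m) for m, _ in toy]
labels = [gender for _, gender in toy]
tree = build_tree(rows, labels, list(rows[0]))
```

Each root-to-leaf path in `tree` reads as one rule, for example "if the last morpheme is `wa`, answer female". Calling `classify(tree, extract_features(sentence))` applies those rules to a new sentence.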