An automatic email distribution by using text mining and reinforcement learning

  • Authors:
  • Yoshihiro Ueda;Hitoshi Narita;Naotaka Kato;Katsuaki Hayashi;Hidetaka Nambo;Haruhiko Kimura

  • Affiliations:
  • Industrial Research Institute of Ishikawa, Kanazawa, 920-8203 Japan;Faculty of Engineering, Kanazawa University, Kanazawa, 920-8667 Japan;Industrial Research Institute of Ishikawa, Kanazawa, 920-8203 Japan;Industrial Research Institute of Ishikawa, Kanazawa, 920-8203 Japan;Faculty of Engineering, Kanazawa University, Kanazawa, 920-8667 Japan;Faculty of Engineering, Kanazawa University, Kanazawa, 920-8667 Japan

  • Venue:
  • Systems and Computers in Japan
  • Year:
  • 2006

Abstract

The authors built a system that automatically distributes inquiry email from electronic mail systems or the Web to the appropriate supervisor. The proposed method gathers documents created by the supervisors, calculates the tf·idf value and the idf/conf value for the words that appear in those documents, and creates two types of dictionaries for each supervisor. A distinguishing feature of the method is that it uses Profit Sharing, a reinforcement learning method, instead of conventional inductive learning to learn the two term weights. The system compares an inquiry email against the dictionaries, calculates a score for each supervisor from the word weights and match rates, and identifies a supervisor with a high score as the respondent. To evaluate the effectiveness of the method, the authors ran experiments on real inquiry emails and found the following. (1) The accuracy necessary for practical use was determined from the distribution accuracy of the specialists who distribute inquiry email. (2) Distribution using only the tf·idf value and the idf/conf value did not reach an accuracy sufficient for practical purposes. (3) Reinforcement learning of the word weights in (2) yielded a practical level of distribution accuracy, roughly equivalent to that of the distribution specialists. Finally, the authors evaluated the number of document files and noise necessary to obtain the practical level of accuracy in (3) and compared the accuracy of their method with that of a conventional text categorization method. © 2006 Wiley Periodicals, Inc. Syst Comp Jpn, 37(12): 82–95, 2006; Published online in Wiley InterScience. DOI 10.1002/scj.20387
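
The pipeline described in the abstract (per-supervisor dictionaries, a score per supervisor, and reinforcement of word weights) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the idf/conf value, the exact scoring formula, or the Profit Sharing credit assignment, so the sketch makes several assumptions: it builds only a tf·idf dictionary, multiplies the summed weights of matched words by the match rate, and splits a fixed reward equally among the matched words after a correct distribution. All function names, the whitespace tokenizer, and the sample data are hypothetical.

```python
import math
from collections import Counter


def build_dictionaries(docs_by_supervisor):
    """Return {supervisor: {word: tf*idf}}.

    Assumption: all documents of one supervisor are pooled into a single
    bag of words, and idf is log(N / df) over supervisors.
    """
    tf = {
        s: Counter(w for doc in docs for w in doc.lower().split())
        for s, docs in docs_by_supervisor.items()
    }
    n = len(tf)
    df = Counter(w for counts in tf.values() for w in counts)
    return {
        s: {w: c * math.log(n / df[w]) for w, c in counts.items()}
        for s, counts in tf.items()
    }


def score(email, dictionary):
    """Assumed score: match rate times the summed weights of matched words."""
    words = set(email.lower().split())
    if not words:
        return 0.0
    matched = [w for w in words if w in dictionary]
    return (len(matched) / len(words)) * sum(dictionary[w] for w in matched)


def distribute(email, dictionaries):
    """Pick the supervisor with the highest score."""
    return max(dictionaries, key=lambda s: score(email, dictionaries[s]))


def reinforce(email, correct, dictionaries, reward=1.0):
    """Profit-Sharing-style update (assumed form).

    When the email is distributed correctly, the reward is shared equally
    among the matched words of the chosen supervisor's dictionary.
    """
    chosen = distribute(email, dictionaries)
    if chosen != correct:
        return False
    weights = dictionaries[chosen]
    matched = [w for w in set(email.lower().split()) if w in weights]
    for w in matched:
        weights[w] += reward / len(matched)
    return True


# Hypothetical sample data for illustration only.
docs = {
    "alice": ["laser welding metal", "metal cutting laser"],
    "bob": ["textile dye fabric", "fabric weaving"],
}
dictionaries = build_dictionaries(docs)
email = "question about laser cutting"
respondent = distribute(email, dictionaries)
```

In this sketch, repeatedly calling `reinforce` on labeled inquiry emails raises the weights of words that contributed to correct distributions, which is the general idea behind learning the word weights from feedback rather than fixing them at their tf·idf values.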