Bayesian web document classification through optimizing association word

  • Authors:
  • Su Jeong Ko; Jun Hyeog Choi; Jung Hyun Lee

  • Affiliations:
  • School of Computer Science & Engineering, Inha University, Yong_hyen dong, Namgu, Inchon, Korea; Division of Computer Science, Kimpo College, Kimpo, Kyonggi-do, Korea; School of Computer Science & Engineering, Inha University, Yong_hyen dong, Namgu, Inchon, Korea

  • Venue:
  • IEA/AIE'2003 Proceedings of the 16th international conference on Developments in applied artificial intelligence
  • Year:
  • 2003

Abstract

Previous Bayesian document classification methods have a problem: they do not accurately reflect semantic relations when expressing the characteristics of a document. To resolve this problem, this paper proposes a Bayesian document classification method based on mining and refining association words. The Apriori algorithm extracts the characteristics of a test document in the form of association words that reflect semantic relations, and it also mines association words from the learning documents. If association words are mined from the learning documents with the Apriori algorithm alone, inappropriate association words are included among them, which reduces classification accuracy. To compensate for this disadvantage, we adopt a method that refines the association words using a genetic algorithm. A Naïve Bayes classifier then classifies test documents based on the refined association words.
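
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no support threshold, feature encoding, or fitness function, so the toy documents, the function names (mine_association_words, train, classify), the restriction to word pairs, the 0.6 support threshold, and the Laplace smoothing are all assumptions made for illustration. The genetic-algorithm refinement step is omitted entirely; the sketch only shows Apriori-style mining of association words followed by Naïve Bayes classification over them.

```python
from collections import Counter
from itertools import combinations
import math


def mine_association_words(docs, min_support=0.6, size=2):
    """Apriori-style mining (illustrative): return word sets of `size`
    that occur together in at least `min_support` of the documents."""
    n = len(docs)
    # Pass 1: frequent single words. Any frequent word set can only
    # contain frequent words, so infrequent words are pruned here.
    word_counts = Counter(w for d in docs for w in set(d))
    frequent = {w for w, c in word_counts.items() if c / n >= min_support}
    # Pass 2: count candidate sets built only from frequent words.
    set_counts = Counter()
    for d in docs:
        kept = sorted(set(d) & frequent)
        set_counts.update(combinations(kept, size))
    return {s for s, c in set_counts.items() if c / n >= min_support}


def features(doc, vocab):
    """Association words from `vocab` whose members all occur in `doc`."""
    words = set(doc)
    return [a for a in vocab if words.issuperset(a)]


def train(labelled_docs, min_support=0.6):
    """Mine association words per class, then estimate Naive Bayes
    priors and Laplace-smoothed likelihoods over those association words."""
    by_class = {}
    for label, doc in labelled_docs:
        by_class.setdefault(label, []).append(doc)
    mined = {c: mine_association_words(ds, min_support)
             for c, ds in by_class.items()}
    vocab = sorted(set().union(*mined.values()))
    model = {"vocab": vocab, "prior": {}, "likelihood": {}}
    for c, ds in by_class.items():
        model["prior"][c] = len(ds) / len(labelled_docs)
        counts = Counter(a for d in ds for a in features(d, vocab))
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        model["likelihood"][c] = {a: (counts[a] + 1) / denom for a in vocab}
    return model


def classify(model, doc):
    """Return the class with the highest log-posterior for `doc`."""
    feats = features(doc, model["vocab"])

    def score(c):
        return math.log(model["prior"][c]) + sum(
            math.log(model["likelihood"][c][a]) for a in feats)

    return max(model["prior"], key=score)


# Hypothetical toy training data: (class label, list of words).
TRAIN = [
    ("game", ["game", "player", "score", "team"]),
    ("game", ["game", "player", "level"]),
    ("graphics", ["render", "pixel", "shader"]),
    ("graphics", ["render", "pixel", "texture"]),
]

model = train(TRAIN)
label = classify(model, ["new", "game", "player", "mode"])
print(label)
```

In this sketch a refinement step, such as the genetic algorithm the abstract mentions, would sit between mining and training, pruning entries from each class's mined set before the likelihoods are estimated.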