Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Improving statistical language model performance with automatically generated word hierarchies
Computational Linguistics
Automatic Indexing: An Experimental Inquiry
Journal of the ACM (JACM)
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Feature Subset Selection in Text-Learning
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Intelligent Spider for Internet Searching
HICSS '97 Proceedings of the 30th Hawaii International Conference on System Sciences: Information Systems Track—Internet and the Digital Economy - Volume 4
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
User preference through learning user profile for ubiquitous recommendation systems
KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Hi-index | 0.00 |
Previous Bayesian document classification has a problem because it does not reflect semantic relation accurately in expressing characteristic of document. In order to resolve this problem, this paper suggests Bayesian document classification method through mining and refining of association word. Apriori algorithm extracts characteristic of test document in form of association words that reflects semantic relation and it mines association words from learning documents. If association word from learning documents is mined only with Apriori algorithm, inappropriate association word is included within them. Accordingly it has disadvantage of lack of accuracy in document classification. In order to complement the disadvantage, we adopt method to refine association words through use of genetic algorithm. Naïve Bayes classifier classifies test documents based on refined association words.