High Relevance Keyword Extraction facility for Bayesian text classification on different domains of varying characteristic

Authors:
Lam Hong Lee;Dino Isa;Wou Onn Choo;Wen Yeen Chue
Affiliations:
Faculty of Information and Communication Technology - Perak Campus, Universiti Tunku Abdul Rahman, Bandar Barat, 31900 Kampar, Perak, Malaysia;Intelligent Systems Research Group, Faculty of Engineering, The University of Nottingham, Malaysia Campus, Jalan Broga, 43500 Semenyih, Selangor, Malaysia;Faculty of Information and Communication Technology - Perak Campus, Universiti Tunku Abdul Rahman, Bandar Barat, 31900 Kampar, Perak, Malaysia;Faculty of Business and Finance, Universiti Tunku Abdul Rahman, Bandar Barat, 31900 Kampar, Perak, Malaysia
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

The High Relevance Keyword Extraction (HRKE) facility is introduced to Bayesian text classification to perform feature/keyword extraction during the classification stage, without requiring extensive pre-classification processing. To extract keywords, the HRKE facility uses the posterior probability of each keyword within a specific category associated with the text document. The experimental results show that the HRKE facility delivers promising classification performance for the Bayesian classifier across text classification domains of varying characteristics. The result is an effective and efficient Bayesian text classifier that handles different domains of varying characteristics with high accuracy, while retaining the simplicity and low processing cost of the conventional Bayesian classification approach.
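
The abstract does not spell out the extraction rule, so the following Python sketch is only one possible reading of "keyword extraction by posterior probability at classification time", not the authors' implementation. It assumes a multinomial naive Bayes model with add-one smoothing, keeps a word as a keyword when its posterior P(category | word) reaches a cutoff for at least one category, and then classifies using only the retained words. The function names, the `threshold` value of 0.6, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labelled_docs):
    """Count words per category and documents per category."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for tokens, category in labelled_docs:
        word_counts[category].update(tokens)
        doc_counts[category] += 1
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def likelihood(word, category, model):
    """P(word | category) with add-one (Laplace) smoothing."""
    word_counts, _, vocabulary = model
    total = sum(word_counts[category].values())
    return (word_counts[category][word] + 1) / (total + len(vocabulary))


def posterior(word, model):
    """P(category | word) for every category, via Bayes' rule."""
    _, doc_counts, _ = model
    n_docs = sum(doc_counts.values())
    joint = {c: likelihood(word, c, model) * doc_counts[c] / n_docs
             for c in doc_counts}
    evidence = sum(joint.values())
    return {c: p / evidence for c, p in joint.items()}


def extract_keywords(tokens, model, threshold=0.6):
    """Keep words whose best category posterior reaches the (assumed) cutoff."""
    return [w for w in tokens
            if max(posterior(w, model).values()) >= threshold]


def classify(tokens, model, threshold=0.6):
    """Naive Bayes decision over the extracted keywords only."""
    _, doc_counts, _ = model
    # Fall back to all tokens if no word passes the cutoff (an assumption).
    keywords = extract_keywords(tokens, model, threshold) or tokens
    n_docs = sum(doc_counts.values())
    scores = {
        c: math.log(doc_counts[c] / n_docs)
        + sum(math.log(likelihood(w, c, model)) for w in keywords)
        for c in doc_counts
    }
    return max(scores, key=scores.get)


# Toy, made-up training data for illustration only.
model = train([
    (["ball", "goal", "team", "the"], "sports"),
    (["team", "match", "goal", "the"], "sports"),
    (["stock", "market", "the", "bank"], "finance"),
    (["bank", "loan", "market", "the"], "finance"),
])
print(extract_keywords(["the", "goal", "team"], model))
print(classify(["the", "goal", "team"], model))
```

In this sketch a word such as "the", which is equally likely in both toy categories, has a posterior of 0.5 for each and is dropped, while category-specific words are kept. No separate feature-selection pass over the corpus is needed, which matches the abstract's emphasis on avoiding extensive pre-classification processing.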