This paper presents WebDoc, an automatic document classification system that classifies Web documents according to the Library of Congress classification scheme. WebDoc constructs a knowledge base from the training data and then classifies documents based on the information in that knowledge base. One of the classification algorithms used in WebDoc is based on Bayes' theorem from probability theory. This paper focuses on three aspects of this approach: different event models for the naive Bayes method, different probability smoothing methods, and different feature selection methods. We report the performance of each method in terms of recall, precision, and F-measure. Experimental results show that the WebDoc system can classify Web documents effectively and efficiently.
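As an illustration of the kind of classifier described above, the following is a minimal sketch and not WebDoc's actual implementation. The abstract does not say which event models or smoothing methods were compared, so the sketch assumes a multinomial event model with Laplace (add-alpha) smoothing. The function names, the toy documents, and the use of the Library of Congress classes "K" (Law) and "Q" (Science) as labels are all hypothetical choices made for the example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods from tokenized documents.

    Assumption: multinomial event model with add-alpha (Laplace) smoothing.
    """
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)

    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        denom = total + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_like": {
                t: math.log((term_counts[label][t] + alpha) / denom) for t in vocab
            },
        }
    return model


def classify(model, doc):
    """Return the class with the highest log score; unseen tokens are ignored."""
    def score(params):
        known = (t for t in doc if t in params["log_like"])
        return params["log_prior"] + sum(params["log_like"][t] for t in known)

    return max(model, key=lambda label: score(model[label]))


def precision_recall_f1(gold, predicted, target):
    """Compute precision, recall, and F1 for one target class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy corpus: two documents per class.
train_docs = [
    ["law", "court", "judge"],
    ["court", "trial"],
    ["cell", "protein"],
    ["protein", "gene", "cell"],
]
train_labels = ["K", "K", "Q", "Q"]
model = train_multinomial_nb(train_docs, train_labels)
```

With this toy model, `classify(model, ["court", "judge"])` selects class "K", and `precision_recall_f1` can then score a batch of predictions per class. The smoothing constant `alpha` is the parameter a comparison of smoothing methods would vary; feature selection would correspond to restricting `vocab` before the likelihoods are estimated.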