Fast Text Classification Using Sequential Sampling Processes

Authors:
Michael D. Lee
Affiliations:
-
Venue:
AI '01 Proceedings of the 14th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
2001

Citing 3
Cited 0

A re-examination of text categorization methods

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Stochastic dynamic models of response time and accuracy: a foundational primer

Journal of Mathematical Psychology
Information Retrieval

Information Retrieval

Quantified Score

Hi-index	0.00

Visualization

Abstract

A central problem in information retrieval is the automated classification of text documents. While many existing methods achieve good levels of performance, they generally require levels of computation that prevent them from making sufficiently fast decisions in some applied setting. Using insights gained from examining the way humans make fast decisions when classifying text documents, two new text classification algorithms are developed based on sequential sampling processes. These algorithms make extremely fast decisions, because they need to examine only a small number of words in each text document. Evaluation against the Reuters-21578 collection shows both techniques have levels of performance that approach benchmark methods, and the ability of one of the classifiers to produce realistic measures of confidence in its decisions is shown to be useful for prioritizing relevant documents.