Interactive high-quality text classification

Authors:
Rey-Long Liu
Affiliations:
Department of Medical Informatics, Tzu Chi University, Hualien, Taiwan, ROC
Venue:
Information Processing and Management: an International Journal
Year:
2008

Abstract

Automatic text classification (TC) is essential for information sharing and management. Its ideal goal is high-quality TC: (1) accepting almost all documents that should be accepted (i.e., high recall) and (2) rejecting almost all documents that should be rejected (i.e., high precision). Unfortunately, this goal is rarely achieved, which makes automatic TC unsuitable for applications in which a classifier's erroneous decision may incur high cost and/or serious problems. One way to pursue the ideal is to ask users to confirm the classifier's decisions so that potential errors can be corrected. The main challenge of this approach lies in controlling the number of confirmations, since each confirmation adds to the users' cognitive load. We thus develop ICCOM, an intelligent and classifier-independent confirmation strategy. Empirical evaluation shows that ICCOM may help various kinds of classifiers achieve very high precision and recall while conducting fewer confirmations. The contributions are significant to the archiving and recommendation of critical information, since identifying possible TC errors (those that require confirmation) is the key to processing information more properly.
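The trade-off between confirmations and classification quality can be illustrated with a minimal Python sketch. It is not ICCOM itself, whose selection rule the abstract does not describe. It assumes a simple stand-in rule: the user is consulted whenever a classifier's score falls within a fixed band around the acceptance threshold, and the user always answers correctly. The names `Decision`, `classify_with_confirmation`, `threshold`, and `band`, as well as the scores in the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    score: float  # hypothetical classifier score in [0, 1]
    truth: bool   # whether the document really belongs to the category


def precision_recall(accepted, truths):
    """Precision and recall of accept/reject decisions against the truth."""
    tp = sum(1 for a, t in zip(accepted, truths) if a and t)
    fp = sum(1 for a, t in zip(accepted, truths) if a and not t)
    fn = sum(1 for a, t in zip(accepted, truths) if not a and t)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def classify_with_confirmation(decisions, threshold=0.5, band=0.15):
    """Accept or reject each document, consulting the user on uncertain ones.

    Assumption (not from the paper): a decision is "uncertain" when its score
    lies strictly within `band` of `threshold`, and the user is always right.
    Returns the final decisions and the number of confirmations requested.
    """
    accepted = []
    confirmations = 0
    for d in decisions:
        if abs(d.score - threshold) < band:
            confirmations += 1
            accepted.append(d.truth)  # the user supplies the correct decision
        else:
            accepted.append(d.score >= threshold)
    return accepted, confirmations


# Made-up sample data, for illustration only.
decisions = [
    Decision(0.95, True), Decision(0.80, True), Decision(0.55, False),
    Decision(0.45, True), Decision(0.30, False), Decision(0.10, False),
    Decision(0.70, False), Decision(0.20, True),
]
truths = [d.truth for d in decisions]

# band=0.0 disables confirmation, so the classifier decides alone.
auto, n_auto = classify_with_confirmation(decisions, band=0.0)
confirmed, n_confirmed = classify_with_confirmation(decisions, band=0.15)

print(precision_recall(auto, truths), n_auto)
print(precision_recall(confirmed, truths), n_confirmed)
```

On this toy data, two confirmations raise both precision and recall, but the two confident errors (scores 0.70 and 0.20) remain uncorrected. Widening the band would catch them at the cost of more confirmations, which is the cost that a confirmation strategy has to keep under control.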