A Fast Algorithm for Hierarchical Text Classification

Authors:
Wesley T. Chuang;Asok Tiyyagura;Jihoon Yang;Giovanni Giuffrida
Venue:
DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Year:
2000



Abstract

Text classification is becoming more important with the proliferation of the Internet and the huge amount of data it transfers. We present an efficient algorithm for text classification that uses hierarchical classifiers organized along a concept hierarchy. A simple TFIDF classifier is trained on sample data and then used to classify new data. Despite its simplicity, experiments on Web pages and TV closed captions demonstrate high classification accuracy. Applying feature subset selection techniques improves performance further. The algorithm is computationally efficient, bounded by O(n log n) for n samples.
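
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: one TFIDF prototype classifier per internal node of a concept hierarchy, applied top-down. The function names (`tfidf_prototypes`, `train`, `classify`), the smoothed IDF formula, the cosine similarity measure, and the toy hierarchy and documents are all assumptions made for illustration; they are not taken from the paper. Feature subset selection is omitted, and the sketch makes no attempt to reproduce the stated O(n log n) bound.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization on lowercased text.
    return text.lower().split()


def tfidf_prototypes(docs_by_class):
    """Build one TFIDF prototype vector per class, plus the IDF table."""
    # Term frequencies summed over all documents of each class.
    tf = {c: Counter(t for d in docs for t in tokenize(d))
          for c, docs in docs_by_class.items()}
    # Document frequencies over every document seen at this node.
    all_docs = [d for docs in docs_by_class.values() for d in docs]
    df = Counter(t for d in all_docs for t in set(tokenize(d)))
    n = len(all_docs)
    # Assumption: a smoothed IDF variant; the paper's weighting may differ.
    idf = {t: math.log((1 + n) / (1 + f)) + 1.0 for t, f in df.items()}
    protos = {c: {t: cnt * idf[t] for t, cnt in counts.items()}
              for c, counts in tf.items()}
    return protos, idf


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def train(hierarchy, labeled_docs, root="root"):
    """Train one prototype classifier per internal node.

    hierarchy: dict mapping a parent node to its list of children.
    labeled_docs: list of (text, path), path being the node names
    from just below the root down to the leaf.
    """
    models = {}
    for parent in hierarchy:
        docs_by_child = defaultdict(list)
        for text, path in labeled_docs:
            full = (root,) + tuple(path)
            if parent in full:
                i = full.index(parent)
                if i + 1 < len(full):
                    docs_by_child[full[i + 1]].append(text)
        if docs_by_child:
            models[parent] = tfidf_prototypes(docs_by_child)
    return models


def classify(models, text, root="root"):
    """Descend the hierarchy, picking the most similar child at each node."""
    node, path = root, []
    while node in models:
        protos, idf = models[node]
        # Terms unseen at this node get weight 0.
        query = {t: c * idf.get(t, 0.0)
                 for t, c in Counter(tokenize(text)).items()}
        node = max(protos, key=lambda c: cosine(query, protos[c]))
        path.append(node)
    return path


# Hypothetical toy data, for illustration only.
hierarchy = {
    "root": ["sports", "science"],
    "sports": ["soccer", "tennis"],
    "science": ["physics", "biology"],
}
labeled_docs = [
    ("goal striker penalty match league", ("sports", "soccer")),
    ("racket serve match court set", ("sports", "tennis")),
    ("quantum particle energy field", ("science", "physics")),
    ("cell gene protein organism", ("science", "biology")),
]
models = train(hierarchy, labeled_docs)
result = classify(models, "the striker scored a penalty goal")
```

With this toy data, `result` is `["sports", "soccer"]`. Each document is compared only against the children of the node it has reached, so the number of comparisons grows with the depth and branching factor of the hierarchy rather than with the total number of leaf classes, which is one plausible source of the efficiency the abstract describes.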