Multiclass classification has been investigated in the literature for many years. Recently, real-world multiclass classification applications have grown steadily in scale. For example, hundreds of thousands of categories are used in the Open Directory Project (ODP) and the Yahoo! directory. In such cases, the scalability of classification methods becomes a major concern. To tackle this problem, hierarchical classification has been proposed and widely adopted to achieve a better trade-off between effectiveness and efficiency. Unfortunately, many data sets are not explicitly organized in hierarchical form, so hierarchical classification cannot be applied to them directly. In this paper, we propose a novel algorithm that automatically mines a hierarchical structure from the flat taxonomy of a data corpus, in preparation for hierarchical classification. In particular, we first compute matrices that represent the relations among categories, documents, and terms. We then co-cluster these three types of objects at different scales through consistent bipartite spectral graph co-partitioning, which is formulated as a generalized singular value decomposition problem. Finally, a hierarchical taxonomy is constructed from the category clusters. Our experiments showed that the proposed algorithm can discover a reasonable taxonomy hierarchy and helps improve classification accuracy.
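
As a rough illustration of the co-clustering step, the sketch below splits the rows and columns of a single relation matrix into two co-clusters using an ordinary SVD of the degree-normalized matrix. This is only the simpler two-type case of bipartite spectral graph partitioning, not the consistent three-way (category, document, term) generalized SVD formulation described in the abstract. The function name `bipartite_spectral_bipartition`, the toy matrix, and the choice to threshold the second singular vectors at zero are assumptions made for illustration.

```python
import numpy as np


def bipartite_spectral_bipartition(A):
    """Split rows and columns of a nonnegative relation matrix into two co-clusters.

    Illustrative two-type simplification (assumption): rows could be documents
    and columns terms, or rows categories and columns documents.
    Every row and every column must have a positive sum.
    """
    A = np.asarray(A, dtype=float)
    d1 = A.sum(axis=1)  # row degrees
    d2 = A.sum(axis=0)  # column degrees
    if np.any(d1 <= 0) or np.any(d2 <= 0):
        raise ValueError("every row and column needs a positive sum")

    # Degree-normalized matrix D1^{-1/2} A D2^{-1/2}.
    An = A / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]

    # The leading singular pair is trivial (it only encodes the degrees);
    # the second pair carries the two-way partition.
    U, _, Vt = np.linalg.svd(An, full_matrices=False)
    u2 = U[:, 1] / np.sqrt(d1)
    v2 = Vt[1, :] / np.sqrt(d2)

    # Assumption: threshold at zero; k-means on these values is a common alternative.
    row_labels = (u2 > 0).astype(int)
    col_labels = (v2 > 0).astype(int)
    return row_labels, col_labels


# Toy relation matrix with two nearly separate blocks (hypothetical data).
A = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [0, 0, 4, 5],
])
row_labels, col_labels = bipartite_spectral_bipartition(A)
print(row_labels, col_labels)
```

Applying such a split recursively to each resulting cluster is one plausible way to obtain a hierarchy from a flat set of categories, although the abstract does not specify how the paper derives the taxonomy from the category clusters. Which cluster receives label 0 or 1 depends on the sign convention of the SVD routine, so only the grouping is meaningful.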