Journal of Information Science
In this paper, we present a theoretical analysis of, and extensive experiments on, the automated assignment of Dewey Decimal Classification (DDC) classes to bibliographic data using a supervised machine-learning approach. Library classification systems such as the DDC pose serious obstacles to state-of-the-art text categorization (TC) technologies, including deep hierarchies, data sparseness, and skewed category distributions. We first statistically analyze the document and category distributions over the DDC and discuss the obstacles that bibliographic corpora and library classification schemes impose on TC technology. To overcome these obstacles, we propose an innovative algorithm that reshapes the DDC structure into a balanced virtual tree by balancing the category distribution and flattening the hierarchy. To raise classification effectiveness to a level acceptable for real-world applications, we propose an interactive classification model that can predict a class at any depth within a limited number of user interactions. The experiments are conducted on a large bibliographic collection in the science and technology domains, created by the Library of Congress over 10 years. With no more than three interactions, the model achieves a classification accuracy of nearly 90%, thus providing a practical solution to the automatic bibliographic classification problem. © 2009 Wiley Periodicals, Inc.
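The abstract describes the reshaping of the DDC only at a high level. As a rough illustration of what flattening a sparse hierarchy could look like, the sketch below folds under-populated classes into their parents, deepest level first. It is not the algorithm evaluated in the paper: the function name `merge_sparse`, the `min_docs` threshold, the toy counts, and the rule that a class's parent is its notation minus the last digit are all assumptions made for this example.

```python
def merge_sparse(counts: dict[str, int], min_docs: int):
    """Fold classes with fewer than `min_docs` documents into their parent.

    `counts` maps a digit-only class notation (e.g. "516") to the number of
    documents assigned directly to that class. Assumption for this sketch:
    the parent of a class is its notation with the last digit removed.
    Returns (merged counts, mapping from original class to virtual class).
    """
    merged = dict(counts)
    assignment = {code: code for code in counts}
    max_depth = max(len(code) for code in counts)

    # Walk from the deepest level up to level 2; top-level classes have
    # no parent and are left untouched.
    for depth in range(max_depth, 1, -1):
        for code in [c for c in merged if len(c) == depth]:
            if merged[code] < min_docs:
                parent = code[:-1]
                merged[parent] = merged.get(parent, 0) + merged.pop(code)
                # Redirect every original class that pointed at `code`.
                for original, target in assignment.items():
                    if target == code:
                        assignment[original] = parent
    return merged, assignment


# Toy, made-up counts (not taken from the paper's collection).
toy_counts = {"51": 40, "516": 3, "5163": 1, "53": 2, "536": 50}
virtual_counts, virtual_class = merge_sparse(toy_counts, min_docs=5)
```

In this toy run the sparse branch `516`/`5163` collapses into `51`, while the well-populated class `536` keeps its own node, so a classifier would be trained on fewer, larger categories. A full solution would also need to split over-populated classes to balance the tree, which this sketch does not attempt.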