This article describes an unsupervised approach to automatically classifying scientific literature archived in digital libraries and repositories according to a standard library classification scheme. The method identifies all the references cited in the document to be classified, retrieves the subject classification metadata of those references as catalogued in existing conventional libraries, and uses a weighting mechanism to infer the most probable class for the document itself. We demonstrate the method and assess its performance with a prototype software system that classifies scientific documents according to the Dewey Decimal Classification scheme. A dataset of 1000 research articles, papers, and reports from a well-known scientific digital library, CiteSeer, was used to evaluate the classification performance of the system. Detailed results of this experiment are presented and discussed.
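The inference step can be illustrated with a minimal sketch. The abstract does not specify the weighting mechanism, so the code below assumes a plain weighted vote over the three-digit Dewey classes of the cited references. The function name `infer_ddc_class`, its parameters, and the sample class numbers are hypothetical and chosen only for illustration; they are not taken from the prototype system.

```python
from collections import defaultdict


def infer_ddc_class(reference_classes, weights=None):
    """Pick the most heavily weighted three-digit DDC class among cited references.

    reference_classes: one DDC number (as a string, e.g. "006.31") per cited
        reference, or None when no catalogue record was found for it.
    weights: optional per-reference weights (assumption: uniform if omitted).
    Returns the winning three-digit class, or None if no reference was classified.
    """
    if weights is None:
        weights = [1.0] * len(reference_classes)

    scores = defaultdict(float)
    for ddc, weight in zip(reference_classes, weights):
        if ddc is None:
            continue  # reference not found in any library catalogue
        main_class = ddc.split(".")[0]  # keep the digits before the decimal point
        scores[main_class] += weight

    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical example: four cited references, one without a catalogue match.
cited = ["006.31", "006.4", None, "025.04"]
print(infer_ddc_class(cited))
```

With uniform weights the example selects class 006, since two of the three classified references fall under it. Passing explicit weights (for instance, to favour references that are cited more often in the text) changes the outcome without altering the voting logic.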