In this article, we evaluate text clustering and classification methods for creating digital library browse interfaces, focusing on the particular case of collections made up of heterogeneous metadata records. This situation is common in "portal" style digital libraries, which are built by harvesting content from many disparate sources, typically using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). By studying the activity of users in an experimental system, we find that taxonomies built or populated using machine-learning (or "AI") techniques provide a potentially useful avenue for browsing in this digital library scenario.