Digital libraries organize, classify, and publish collections of electronic content held on computers or networks. They are also easy to use and configure, and they offer a graphical user interface for fast searching and browsing over a repository of documents. This article presents a digital library prototype that retrieves, indexes, and clusters documents published on a website. The website may include unstructured, semi-structured, and structured documents, such as web pages, scientific papers, news items, and documents in several formats whose content is essentially text. The proposed prototype includes a clustering process that uses a conceptual algorithm and an a priori cluster-labeling process. Preliminary results come from tests on different sets of documents published on a real website.