Digital web library of a website with document clustering

Authors:
Isabel Mahecha-Nieto; Elizabeth León
Affiliations:
Universidad Nacional de Colombia, Departamento de Ingeniería de Sistemas e Industrial, Bogotá, Colombia; Universidad Nacional de Colombia, Departamento de Ingeniería de Sistemas e Industrial, Bogotá, Colombia
Venue:
IBERAMIA'10 Proceedings of the 12th Ibero-American conference on Advances in artificial intelligence
Year:
2010

Abstract

Digital libraries make it possible to organize, classify, and publish collections of electronic content available on computers or networks. They are also easy to use and configure, and they offer a graphical user interface for fast searching and browsing over a repository of documents. This article presents a digital library prototype for retrieving, indexing, and clustering documents published on a website. The website may include unstructured, semi-structured, and structured documents, such as web pages, scientific papers, news items, and documents in various formats that consist essentially of text. The proposed prototype includes a clustering process that uses a conceptual algorithm and an a priori cluster-labeling process. Preliminary results come from tests on different sets of documents published on a real website.
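
The indexing and clustering steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the prototype's method: the abstract does not specify the conceptual clustering algorithm or the labeling procedure, so the sketch substitutes TF-IDF indexing, a greedy single-pass (leader) clustering over cosine similarity, and labels built from each cluster's top-weighted terms. All function names, the 0.3 threshold, and the sample documents are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Index documents as sparse TF-IDF vectors (term -> weight)."""
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps every weight strictly positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_clusters(vectors, threshold=0.3):
    """Assign each document to the first cluster whose leader (its first
    member) is similar enough; otherwise start a new cluster.
    This is a stand-in, not the paper's conceptual algorithm."""
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def label_clusters(vectors, clusters, k=2):
    """Label each cluster with its k highest-weighted terms
    (ties broken alphabetically). Hypothetical labeling step."""
    labels = []
    for members in clusters:
        totals = Counter()
        for i in members:
            for t, w in vectors[i].items():
                totals[t] += w
        ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
        labels.append([t for t, _ in ranked[:k]])
    return labels


# Hypothetical sample documents standing in for pages crawled from a website.
docs = [
    "digital library search and browsing",
    "digital library indexing and search",
    "football match results and scores",
    "football scores and league results",
]
vectors = tfidf_vectors(docs)
clusters = leader_clusters(vectors)
print(clusters, label_clusters(vectors, clusters))
```

The leader scheme was chosen only because it fits in a few lines and needs no external library; its result depends on document order and on the threshold, so a real system would likely use a more robust algorithm.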