Exploiting corpus-related ontologies for conceptualizing document corpora
Journal of the American Society for Information Science and Technology
Journal of Biomedical Informatics
As the volume of available information grows, techniques such as document clustering and information visualization have been developed to help users make sense of it. However, most of these methods do not directly help users understand the key concepts in a document corpus and the semantic relationships among them, both of which are critical for capturing the corpus's conceptual structure. We therefore propose a novel approach, called 'Clonto', that identifies key concepts and automatically generates an ontology from them to conceptualize a document corpus. Clonto applies latent semantic analysis to identify key concepts, allocates documents to these concepts, and uses WordNet to automatically generate a corpus-related ontology. The documents are linked to the ontology through the key concepts. Experimental results show that Clonto identifies key concepts with high precision and that its clustering results outperform those of the STC (Suffix Tree Clustering) algorithm, the Lingo clustering algorithm, the Fuzzy Ants clustering algorithm, and clustering based on TRS (Tolerance Rough Set). Moreover, on the same document corpus, the ontology generated by Clonto exhibits a significantly informative conceptual structure.
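The three stages named in the abstract (latent semantic analysis for key concepts, document allocation, and ontology generation from a lexical hierarchy) can be illustrated with a minimal sketch. This is not Clonto's actual implementation, and everything in it is an assumption made for illustration: the toy corpus, the rule of taking the highest-loading term in each latent dimension as a key concept, the count-based allocation rule, and the hand-written `HYPERNYMS` table that stands in for WordNet lookups. The latent dimensions are approximated by power iteration with deflation so that the sketch needs only the standard library.

```python
import math
from collections import defaultdict

# Toy corpus, invented for illustration (not from the paper's experiments).
DOCS = {
    "d1": "dog cat dog pet",
    "d2": "cat pet cat",
    "d3": "car truck car road car",
    "d4": "truck road car",
}

# Hypothetical stand-in for WordNet hypernym lookups (child -> parent).
HYPERNYMS = {
    "dog": "animal", "cat": "animal", "animal": "entity",
    "car": "vehicle", "truck": "vehicle", "vehicle": "entity",
}

def term_doc_matrix(docs):
    """Rows are terms, columns are documents, cells are raw term counts."""
    terms = sorted({w for text in docs.values() for w in text.split()})
    names = sorted(docs)
    rows = [[float(docs[d].split().count(t)) for d in names] for t in terms]
    return terms, names, rows

def project_onto_docs(rows, u):
    """Compute A^T u: one value per document."""
    return [sum(rows[i][j] * u[i] for i in range(len(rows)))
            for j in range(len(rows[0]))]

def top_left_singular_vector(rows, iters=100):
    """Power iteration on A A^T; returns a unit vector over terms."""
    u = [1.0] * len(rows)
    for _ in range(iters):
        w = project_onto_docs(rows, u)
        u = [sum(r[j] * w[j] for j in range(len(w))) for r in rows]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    return u

def key_concepts(docs, k=2):
    """Assumed rule: the highest-loading term of each of the top-k dimensions."""
    terms, _, rows = term_doc_matrix(docs)
    concepts = []
    for _ in range(k):
        u = top_left_singular_vector(rows)
        best = max(range(len(terms)), key=lambda i: abs(u[i]))
        concepts.append(terms[best])
        # Deflate: remove the found direction so the next pass finds a new one.
        w = project_onto_docs(rows, u)
        rows = [[rows[i][j] - u[i] * w[j] for j in range(len(w))]
                for i in range(len(terms))]
    return concepts

def allocate(docs, concepts):
    """Assumed rule: assign each document to the key concept it mentions most."""
    clusters = defaultdict(list)
    for name in sorted(docs):
        words = docs[name].split()
        best = max(concepts, key=words.count)
        clusters[best if words.count(best) else "unassigned"].append(name)
    return dict(clusters)

def build_ontology(concepts, hypernyms):
    """Walk up the hypernym table from each key concept, collecting is-a edges."""
    edges = set()
    for term in concepts:
        while term in hypernyms:
            edges.add((term, hypernyms[term]))
            term = hypernyms[term]
    return sorted(edges)

concepts = key_concepts(DOCS)
clusters = allocate(DOCS, concepts)
ontology = build_ontology(concepts, HYPERNYMS)
print(concepts, clusters, ontology)
```

In this sketch the documents are tied to the ontology only through the key concepts: each cluster label is also a leaf of the generated hierarchy, which mirrors the linkage the abstract describes. A fuller version would replace the hand-written table with real WordNet lookups and would need to handle word-sense ambiguity, which the toy table sidesteps.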