Users find it difficult to explore and analyze the information in a large document collection. Tagging has been used to improve the organization of documents in a collection, but it has various limitations. We propose to improve the analysis and exploration of tagged document collections by organizing the documents into clusters and allowing users to perform online analytical processing (OLAP) on the clusters. Supporting OLAP on clusters of documents, however, poses several challenges: representing cluster centroids and document positions inside the clusters efficiently, dealing with overlapping clusters, aggregating clusters efficiently and accurately, helping users find representative documents for a cluster, and determining the strength of the relationship between clusters.
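To make three of these challenges concrete, the following is a minimal sketch, not the method of the paper. It assumes that each document is a bag of tags, that a cluster centroid is the mean tag-frequency vector of its documents, and that the clusters being merged are disjoint. All function names are hypothetical. Under those assumptions, a roll-up of two clusters can be computed from the stored centroids and cluster sizes alone, and representative documents can be ranked by cosine similarity to the centroid.

```python
import math
from collections import Counter


def centroid(docs):
    """Mean tag-frequency vector of a cluster (assumed representation)."""
    total = Counter()
    for doc in docs:
        total.update(doc)
    n = len(docs)
    return {tag: count / n for tag, count in total.items()}


def merge_centroids(c1, n1, c2, n2):
    """Roll up two disjoint clusters as the size-weighted mean of their centroids."""
    n = n1 + n2
    tags = set(c1) | set(c2)
    return {t: (c1.get(t, 0.0) * n1 + c2.get(t, 0.0) * n2) / n for t in tags}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def representatives(docs, cent, k=1):
    """Indices of the k documents most similar to the centroid."""
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(docs[i], cent),
                    reverse=True)
    return ranked[:k]


# Toy data: two small clusters of tagged documents (illustrative only).
cluster_a = [Counter({"olap": 2, "text": 1}), Counter({"olap": 1})]
cluster_b = [Counter({"image": 1})]

ca, cb = centroid(cluster_a), centroid(cluster_b)
merged = merge_centroids(ca, len(cluster_a), cb, len(cluster_b))
direct = centroid(cluster_a + cluster_b)
```

Here `merged` agrees with `direct` up to floating-point rounding, so the aggregate never needs to revisit the documents. This shortcut fails for overlapping clusters, because a shared document is counted twice, which is one reason overlap appears among the challenges listed above.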