Proceedings of the 43rd annual Southeast regional conference - Volume 1
Total or partial duplication of documents reduces the effectiveness of search-result visualization. In this paper we propose a navigation strategy that sorts a list of documents so that the documents listed first contain more information content, considerably decreasing duplication. The strategy defines a content relation between documents based on their equivalence and omission, and estimates the new information content obtained from visiting each document. We describe the strategy and evaluate it experimentally. The results indicate the potential of this strategy for visualizing thematically related documents that are relevant to a query.