Scatter/Gather: a cluster-based approach to browsing large document collections
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
Rich interaction in the digital library
Communications of the ACM
Information foraging in information access environments
CHI '95 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
TileBars: visualization of term distribution information in full text information access
CHI '95 Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering
Proceedings of the seventh ACM conference on Hypertext
Scatter/gather browsing communicates the topic structure of a very large text collection
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Visualizing search results: some alternatives to query-document similarity
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Reexamining the cluster hypothesis: scatter/gather on retrieval results
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Almost-constant-time clustering of arbitrary corpus subsets
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Projections for efficient document clustering
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Incremental clustering and dynamic information retrieval
STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
Exploring browser design trade-offs using a dynamical model of optimal information foraging
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Axis-specified search: a fine-grained full-text search method for gathering and structuring excerpts
Proceedings of the third ACM conference on Digital libraries
Static and dynamic information organization with star clusters
Proceedings of the seventh international conference on Information and knowledge management
Web document clustering: a feasibility demonstration
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Mining Text Using Keyword Distributions
Journal of Intelligent Information Systems
Integrating content-based access mechanisms with hierarchical file systems
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Efficient algorithms for geometric optimization
ACM Computing Surveys (CSUR)
Fast and effective text mining using linear-time document clustering
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
A practical clustering algorithm for static and dynamic information organization
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Exact and approximation algorithms for clustering
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Proceedings of the eighth international conference on Information and knowledge management
Flexible search functions for multimedia data with text and other auxiliary data
SAC '98 Proceedings of the 1998 ACM symposium on Applied Computing
Document clustering using word clusters via the information bottleneck method
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
A semi-supervised document clustering technique for information organization
Proceedings of the ninth international conference on Information and knowledge management
Using star clusters for filtering
Proceedings of the ninth international conference on Information and knowledge management
Information retrieval on the web
ACM Computing Surveys (CSUR)
Evaluating document clustering for interactive information retrieval
Proceedings of the tenth international conference on Information and knowledge management
Polynomial-time approximation schemes for geometric min-sum median clustering
Journal of the ACM (JACM)
Adaptive Filtering of Newswire Stories using Two-Level Clustering
Information Retrieval
Dynamic Taxonomies: A Model for Large Information Bases
IEEE Transactions on Knowledge and Data Engineering
Three-Tier Clustering: An Online Citation Clustering System
WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
Using Taxonomy, Discriminants, and Signatures for Navigating in Text Databases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Retrieving News Stories from a News Integration Archive
ICADL '02 Proceedings of the 5th International Conference on Asian Digital Libraries: Digital Libraries: People, Knowledge, and Technology
An On-Line Document Clustering Method Based on Forgetting Factors
ECDL '01 Proceedings of the 5th European Conference on Research and Advanced Technology for Digital Libraries
Data Mining and Personalization Technologies
DASFAA '99 Proceedings of the Sixth International Conference on Database Systems for Advanced Applications
Cooperative Information Retrieval Dialogues through Clustering
TDS '00 Proceedings of the Third International Workshop on Text, Speech and Dialogue
Generating, Visualizing, and Evaluating High-Quality Clusters for Information Organization
PODDP '98 Proceedings of the 4th International Workshop on Principles of Digital Document Processing
Memex: A Browsing Assistant for Collaborative Archiving and Mining of Surf Trails
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
The VLDB Journal — The International Journal on Very Large Data Bases
Data mining tasks and methods: Clustering: conceptual clustering
Handbook of data mining and knowledge discovery
Handbook of data mining and knowledge discovery
Data mining for hypertext: a tutorial survey
ACM SIGKDD Explorations Newsletter
Intelligent exploration of the web
On Using Partial Supervision for Text Categorization
IEEE Transactions on Knowledge and Data Engineering
Automatic word sense discrimination
Computational Linguistics - Special issue on word sense disambiguation
Modeling content identification from document images
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Cross-lingual C*ST*RD: English access to Hindi information
ACM Transactions on Asian Language Information Processing (TALIP)
PageCluster: Mining conceptual link hierarchies from Web log files for adaptive Web site navigation
ACM Transactions on Internet Technology (TOIT)
Learning to cluster web search results
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Using web structure and summarisation techniques for web content mining
Information Processing and Management: an International Journal
Multiple sets of features for automatic genre classification of web documents
Information Processing and Management: an International Journal
Generative semantic clustering in spatial hypertext
Proceedings of the 2005 ACM symposium on Document engineering
Towards scatter/gather browsing in a hierarchical peer-to-peer network
Proceedings of the 2005 ACM workshop on Information retrieval in peer-to-peer networks
Automatically labeling hierarchical clusters
dg.o '06 Proceedings of the 2006 international conference on Digital government research
An experimental study on automatically labeling hierarchical clusters using statistical features
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Summarizing local context to personalize global web search
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Search and browse services for heterogeneous collections with the peer-to-peer network Pepper
Information Processing and Management: an International Journal
A text retrieval package for the unix operating system
USTC'94 Proceedings of the USENIX Summer 1994 Technical Conference on USENIX Summer 1994 Technical Conference - Volume 1
WebGlimpse: combining browsing and searching
ATEC '97 Proceedings of the annual conference on USENIX Annual Technical Conference
Identifying a hierarchy of bipartite subgraphs for web site abstraction
Web Intelligence and Agent Systems
A Clustering Algorithm Based on Generalized Stars
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Measuring the similarity between implicit semantic relations using web search engines
Proceedings of the Second ACM International Conference on Web Search and Data Mining
A new visual search interface for web browsing
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Document Clustering Description Extraction and Its Application
ICCPOL '09 Proceedings of the 22nd International Conference on Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy
Dynamicity vs. effectiveness: studying online clustering for scatter/gather
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Revealing collection structure through information access interfaces
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Exploiting internal and external semantics for the clustering of short texts using world knowledge
Proceedings of the 18th ACM conference on Information and knowledge management
Novel labeling strategies for hierarchical representation of multidimensional data analysis results
AIA '08 Proceedings of the 26th IASTED International Conference on Artificial Intelligence and Applications
PGR: portuguese attorney general's office decisions on the web
INAP'01 Proceedings of the Applications of prolog 14th international conference on Web knowledge management and decision support
Collective taxonomizing: A collaborative approach to organizing document repositories
Decision Support Systems
The role of queries in ranking labeled instances extracted from text
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Recovering semantics of tables on the web
Proceedings of the VLDB Endowment
TopicNets: Visual Analysis of Large Text Corpora with Topic Modeling
ACM Transactions on Intelligent Systems and Technology (TIST)
TreeCluster: clustering results of keyword search over databases
WAIM '06 Proceedings of the 7th international conference on Advances in Web-Age Information Management
Graph-based navigation strategies for heterogeneous spatial data sets
GIScience'06 Proceedings of the 4th international conference on Geographic Information Science
Scatter/Gather browsing of web service QoS data
Future Generation Computer Systems
Interpretation and trust: designing model-driven visualizations for text analysis
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Clustering information retrieval search outputs
IRSG'99 Proceedings of the 21st Annual BCS-IRSG conference on Information Retrieval Research
Information vs interaction: examining different interaction models over consistent metadata
Proceedings of the 4th Information Interaction in Context Symposium
Mining subtopics from text fragments for a web query
Information Retrieval
Standing on the schemas of giants: socially augmented information foraging
Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing
The Scatter/Gather document browsing method uses fast document clustering to produce table-of-contents-like outlines of large document collections. Previous work [1] developed linear-time document clustering algorithms to establish the feasibility of this method over moderately large collections. However, even linear-time algorithms are too slow to support interactive browsing of very large collections such as Tipster, the DARPA standard text retrieval evaluation collection. We present a scheme that supports constant interaction-time Scatter/Gather of arbitrarily large collections after near-linear-time preprocessing. The scheme involves the construction of a cluster hierarchy. We also present a modification of Scatter/Gather that employs this scheme, together with an example of its use over the Tipster collection.
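The idea of bounding interaction time with a cluster hierarchy can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `Node` structure, the `budget` parameter, the farthest-first seeding, and the single assignment pass are all hypothetical stand-ins chosen for brevity. The sketch assumes the hierarchy is built once during preprocessing and that each node stores a summed term profile, so a browsing step only touches about `budget` nodes regardless of how many documents lie beneath them.

```python
import heapq
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a precomputed cluster hierarchy (hypothetical structure)."""
    profile: Counter                      # summed term counts of documents below
    size: int                             # number of documents below
    children: list = field(default_factory=list)


def leaf(text):
    """Wrap a single document as a leaf node."""
    return Node(Counter(text.lower().split()), 1)


def merge(*children):
    """Build a parent whose profile summarizes all of its children."""
    profile = Counter()
    for child in children:
        profile.update(child.profile)
    return Node(profile, sum(c.size for c in children), list(children))


def expand(focus, budget):
    """Split the largest node into its children until about `budget` nodes are held."""
    heap = [(-n.size, i, n) for i, n in enumerate(focus)]
    heapq.heapify(heap)
    tie = len(heap)                       # unique tie-breaker, nodes are never compared
    leaves = []
    while heap and len(heap) + len(leaves) < budget:
        _, _, node = heapq.heappop(heap)
        if not node.children:
            leaves.append(node)           # cannot be split further
            continue
        for child in node.children:
            heapq.heappush(heap, (-child.size, tie, child))
            tie += 1
    return leaves + [n for _, _, n in heap]


def cosine(a, b):
    """Cosine similarity between two term-count profiles."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pick_seeds(nodes, k):
    """Farthest-first seeding (an assumption of this sketch, not the paper's method)."""
    seeds = [max(nodes, key=lambda n: n.size)]
    while len(seeds) < min(k, len(nodes)):
        rest = [n for n in nodes if not any(n is s for s in seeds)]
        seeds.append(min(rest, key=lambda n: max(cosine(n.profile, s.profile) for s in seeds)))
    return seeds


def scatter(focus, k, budget):
    """Group the expanded focus into at most k clusters; work depends on budget only."""
    nodes = expand(focus, budget)
    seeds = pick_seeds(nodes, k)
    groups = [[s] for s in seeds]
    for node in nodes:
        if any(node is s for s in seeds):
            continue
        best = max(range(len(seeds)), key=lambda i: cosine(node.profile, seeds[i].profile))
        groups[best].append(node)
    return groups


def label(group, n=3):
    """Outline entry for a cluster: its n most frequent terms."""
    total = Counter()
    for node in group:
        total.update(node.profile)
    return [term for term, _ in total.most_common(n)]


# Toy hierarchy standing in for the preprocessing output.
root = merge(
    merge(leaf("cat dog"), leaf("cat kitten")),
    merge(leaf("stock market"), leaf("stock bond")),
)
groups = scatter([root], k=2, budget=4)   # scatter step
for g in groups:
    print(label(g, 1), sum(n.size for n in g))
focus = groups[0]                         # gather step: the user picks clusters to refocus on
```

In this sketch a gather step simply makes the selected clusters' nodes the new focus, and the next `scatter` call re-expands them under the same budget. The per-step cost is therefore governed by `budget` and `k`, which is the property the constant interaction-time claim depends on; clustering quality here is only as good as the simple seeding heuristic allows.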