Document-topic hierarchies from document graphs
Proceedings of the 21st ACM international conference on Information and knowledge management
Hierarchical taxonomies provide a multi-level view of large document collections, allowing users to rapidly drill down to fine-grained distinctions in topics of interest. We show that automatically induced taxonomies can be made more robust by combining text with relational links. The underlying mechanism is a Bayesian generative model in which a latent hierarchical structure explains the observed data; inference therefore finds hierarchical groups of documents with similar word distributions and dense network connections. As a nonparametric Bayesian model, our approach does not require pre-specification of the branching factor at each non-terminal, but finds the appropriate level of detail directly from the data. Unlike many prior latent space models of network structure, the complexity of our approach does not grow quadratically in the number of documents, enabling application to networks with more than ten thousand nodes. Experimental results on hypertext and citation network corpora demonstrate the advantages of our hierarchical, multimodal approach.
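To illustrate how a nonparametric prior can avoid fixing the branching factor in advance, the following is a minimal sketch of a Chinese restaurant process (CRP), a standard nonparametric Bayesian construction. The abstract does not state which prior the model uses, so the CRP here is an assumption chosen for illustration, and the function name `crp_assignments` and the concentration parameter `alpha` are hypothetical. Each new item joins an existing group with probability proportional to that group's size, or opens a new group with probability proportional to `alpha`, so the number of groups is determined by the data rather than set beforehand.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Seat n_items sequentially under a Chinese restaurant process.

    Illustrative only: the CRP is assumed here as an example of a
    nonparametric prior; it is not taken from the abstract.
    alpha must be positive.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items already in group k
    assignments = []  # assignments[i] = group index chosen for item i
    for _ in range(n_items):
        # Existing group k has weight counts[k]; a new group has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new group
        else:
            counts[k] += 1     # join an existing group
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    assignments, counts = crp_assignments(n_items=200, alpha=1.0)
    print("number of groups:", len(counts))
    print("group sizes:", counts)
```

In a hierarchical setting, one such draw could be made at each non-terminal node to decide how many children it has; larger values of `alpha` tend to produce more groups.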