A common task in many text mining applications is to generate a multi-faceted overview of a topic in a text collection. Such an overview not only serves directly as an informative summary of the topic, but also lets a user navigate to the different facets of the topic in detail. Existing work has cast this problem as a categorization problem and requires training examples for each facet. This has three limitations: (1) all facets are predefined, which may not fit the needs of a particular user; (2) training examples for each facet are often unavailable; (3) such an approach works only for a predefined type of topic. In this paper, we remove these limitations and study a new, more realistic setup of the problem, in which a user can flexibly describe each facet with keywords for an arbitrary topic, and we attempt to mine a multi-faceted overview in an unsupervised way. We propose a probabilistic approach to solve this problem. Experiments on different genres of text data show that our approach can effectively generate a multi-faceted overview for arbitrary topics; the generated overviews are comparable with those generated by supervised methods that use training examples, and they are more informative than unstructured flat summaries. The method is general and can thus be applied to multiple text mining tasks in different application domains.
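To make the problem setup concrete, the following is a toy sketch of the input and output involved: a user describes each facet with a few keywords, and sentences from the collection are grouped under the facets without any training examples. It is not the probabilistic model of the paper; a plain keyword-count heuristic is swapped in for illustration, and the names `tokenize`, `facet_overview`, the sample sentences, and the facet keywords are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def facet_overview(sentences, facets):
    """Group sentences under the facet whose keywords they mention most often.

    sentences: list of str
    facets: dict mapping a facet name to a list of user-supplied keywords
    Sentences that match no keyword are left out of the overview.
    NOTE: this keyword-count heuristic is an assumption made for illustration,
    not the probabilistic approach described in the abstract.
    """
    overview = defaultdict(list)
    for sentence in sentences:
        tokens = tokenize(sentence)
        # Score each facet by how many times its keywords occur in the sentence.
        scores = {
            name: sum(tokens.count(keyword.lower()) for keyword in keywords)
            for name, keywords in facets.items()
        }
        # On ties, max() keeps the facet that appears first in `facets`.
        best = max(scores, key=scores.get)
        if scores[best] > 0:
            overview[best].append(sentence)
    return dict(overview)


# Hypothetical example data: product-review sentences and two user-defined facets.
sentences = [
    "The battery lasts two days.",
    "The screen is bright and sharp.",
    "Shipping was slow.",
]
facets = {
    "battery": ["battery", "charge"],
    "display": ["screen", "display"],
}
overview = facet_overview(sentences, facets)
```

In this sketch the third sentence matches no facet keyword and is dropped. A probabilistic model of the kind the abstract refers to would be expected to go beyond exact keyword matches, which this heuristic cannot do.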