In this paper, we define and study a novel text mining problem, which we refer to as Comparative Text Mining (CTM). Given a set of comparable text collections, the task of comparative text mining is to discover any latent common themes across all collections and to summarize the similarities and differences of these collections along each common theme. This general problem subsumes many interesting applications, including business intelligence and opinion summarization. We propose a generative probabilistic mixture model for comparative text mining. The model simultaneously performs cross-collection clustering and within-collection clustering, and can be applied to an arbitrary set of comparable text collections. The model can be estimated efficiently using the Expectation-Maximization (EM) algorithm. We evaluate the model on two different text data sets (i.e., a news article data set and a laptop review data set), and compare it with a baseline clustering method that is also based on a mixture model. Experimental results show that the model is effective in discovering the latent common themes across collections and performs significantly better than our baseline mixture model.
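To make the idea of simultaneous cross-collection and within-collection clustering concrete, the following is a minimal sketch of EM for a simplified cross-collection mixture, not the exact model of the paper. It assumes that each word is drawn from one of `k` themes and, within a theme, from either a common word distribution shared by all collections or a collection-specific one. The function name `fit_cross_collection_mixture`, the fixed mixing weight `lam_c`, the omission of any background component, and the toy data are all illustrative assumptions.

```python
import random


def normalize(counts):
    """Scale a dict of non-negative weights so that it sums to 1."""
    total = sum(counts.values())
    return {w: v / total for w, v in counts.items()}


def fit_cross_collection_mixture(collections, k, lam_c=0.5, iters=50, seed=0):
    """EM for a simplified cross-collection mixture (illustrative assumption).

    collections: list of collections; a collection is a list of documents;
                 a document is a list of word tokens.
    Returns (common, specific, pi) where
      common[j][w]      ~ p(w | theme j, shared by all collections)
      specific[i][j][w] ~ p(w | theme j, collection i only)
      pi[i][d][j]       ~ weight of theme j in document d of collection i
    """
    rng = random.Random(seed)
    vocab = sorted({w for coll in collections for doc in coll for w in doc})
    m = len(collections)

    def random_dist():
        # Strictly positive random start so no posterior is ever 0/0.
        return normalize({w: rng.random() + 0.1 for w in vocab})

    common = [random_dist() for _ in range(k)]
    specific = [[random_dist() for _ in range(k)] for _ in range(m)]
    pi = [[[1.0 / k] * k for _ in coll] for coll in collections]

    eps = 1e-9  # tiny smoothing keeps every probability positive
    for _ in range(iters):
        acc_common = [{w: eps for w in vocab} for _ in range(k)]
        acc_specific = [[{w: eps for w in vocab} for _ in range(k)]
                        for _ in range(m)]
        for i, coll in enumerate(collections):
            for d, doc in enumerate(coll):
                acc_pi = [eps] * k
                for w in doc:
                    # E-step: posterior over (theme, common-vs-specific).
                    pc = [pi[i][d][j] * lam_c * common[j][w]
                          for j in range(k)]
                    ps = [pi[i][d][j] * (1.0 - lam_c) * specific[i][j][w]
                          for j in range(k)]
                    z = sum(pc) + sum(ps)
                    for j in range(k):
                        acc_common[j][w] += pc[j] / z
                        acc_specific[i][j][w] += ps[j] / z
                        acc_pi[j] += (pc[j] + ps[j]) / z
                # M-step for this document's theme weights; safe to apply
                # now because pi[i][d] only affects this document.
                total = sum(acc_pi)
                pi[i][d] = [v / total for v in acc_pi]
        # M-step for the word distributions, after all documents are seen.
        common = [normalize(dist) for dist in acc_common]
        specific = [[normalize(dist) for dist in row] for row in acc_specific]
    return common, specific, pi


# Toy data (made up): two comparable collections of two short documents each.
toy = [
    [["battery", "life", "long"], ["screen", "bright", "battery"]],
    [["battery", "short", "life"], ["screen", "dim", "heavy"]],
]
common, specific, pi = fit_cross_collection_mixture(toy, k=2)
```

Under these assumptions, `common[j]` plays the role of a theme shared across collections, while `specific[i][j]` captures how collection `i` differs along that theme. Holding `lam_c` fixed rather than estimating it is a simplification chosen to keep the sketch short.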