The use of MMR, diversity-based reranking for reordering documents and producing summaries
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Making large-scale support vector machine learning practical
Advances in kernel methods
The Journal of Machine Learning Research
Generating natural language summaries from multiple on-line sources
Computational Linguistics - Special issue on natural language generation
Discovery of inference rules for question-answering
Natural Language Engineering
More accurate tests for the statistical significance of result differences
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
Mining and summarizing customer reviews
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Information fusion in the context of multi-document summarization
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Opinion observer: analyzing and comparing opinions on the Web
WWW '05 Proceedings of the 14th international conference on World Wide Web
Multidocument summarization via information extraction
HLT '01 Proceedings of the first international conference on Human language technology research
Statistical significance of MUC-6 results
MUC6 '95 Proceedings of the 6th conference on Message understanding
Extracting paraphrases from a parallel corpus
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
Incorporating non-local information into information extraction systems by Gibbs sampling
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Unsupervised topic modelling for multi-party spoken discourse
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Contextual dependencies in unsupervised word segmentation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Extracting product features and opinions from reviews
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Mark-up Barking Up the Wrong Tree
Computational Linguistics
Automatic identification of pro and con reasons in online reviews
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Modeling online reviews with multi-grain topic models
Proceedings of the 17th international conference on World Wide Web
Opinion integration through semi-supervised topic modeling
Proceedings of the 17th international conference on World Wide Web
NAACL-ANLP-AutoSum '00 Proceedings of the 2000 NAACL-ANLP Workshop on Automatic Summarization
Multi-document summarization by graph search and matching
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
The PASCAL recognising textual entailment challenge
MLCW'05 Proceedings of the First international conference on Machine Learning Challenges: evaluating Predictive Uncertainty Visual Object Classification, and Recognizing Textual Entailment
A hybrid hierarchical model for multi-document summarization
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Incorporating content structure into text analysis applications
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
International Journal of Mobile Human Computer Interaction
Hi-index | 0.00 |
This paper presents a new method for inferring the semantic properties of documents by leveraging free-text keyphrase annotations. Such annotations are becoming increasingly abundant due to the recent dramatic growth in semi-structured, user-generated online content. One especially relevant domain is product reviews, which are often annotated by their authors with pros/cons keyphrases such as "a real bargain" or "good value." These annotations are representative of the underlying semantic properties; however, unlike expert annotations, they are noisy: lay authors may use different labels to denote the same property, and some labels may be missing. To learn using such noisy annotations, we find a hidden paraphrase structure which clusters the keyphrases. The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews. Our approach is implemented as a hierarchical Bayesian model with joint inference. We find that joint inference increases the robustness of the keyphrase clustering and encourages the latent topics to correlate with semantically meaningful properties. Multiple evaluations demonstrate that our model substantially outperforms alternative approaches for summarizing single and multiple documents into a set of semantically salient keyphrases.