Feature-based opinion analysis has attracted extensive attention recently. Identifying the features associated with opinions expressed in reviews is essential for fine-grained opinion mining. One approach is to exploit the dependency relations that occur naturally between features and opinion words, and among features (or opinion words) themselves. In this paper, we propose a generalized approach to opinion feature extraction that incorporates robust statistical association analysis into a bootstrapping framework. The new approach starts with a small set of feature seeds, which it iteratively enlarges by mining feature-opinion, feature-feature, and opinion-opinion dependency relations. Two types of association model, namely likelihood ratio tests (LRT) and latent semantic analysis (LSA), are proposed for computing the pairwise associations between terms (features or opinions). We accordingly propose two robust bootstrapping approaches, LRTBOOT and LSABOOT, both of which need just a handful of initial feature seeds to bootstrap opinion feature extraction. We benchmarked LRTBOOT and LSABOOT against existing approaches on a large number of real-life reviews crawled from the cellphone and hotel domains. Experimental results with varying numbers of feature seeds show that the proposed association-based bootstrapping approach significantly outperforms the competitors. In fact, one seed feature is all that LRTBOOT needs to significantly outperform the other methods. This seed feature can simply be the domain feature, e.g., "cellphone" or "hotel". The consequence of our discovery is far-reaching: starting with just one feature seed, typically the domain concept word, LRTBOOT can automatically extract a large set of high-quality opinion features from the corpus without any supervision or labeled features. This means that the automatic creation of a set of domain features is no longer a pipe dream!
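
To make the idea of association-based bootstrapping concrete, the following is a minimal Python sketch, not the authors' LRTBOOT implementation. It assumes several things the abstract does not specify: the likelihood ratio test is taken to be the standard log-likelihood ratio (G²) statistic on a 2x2 co-occurrence table, association is measured by sentence-level co-occurrence rather than by parsed dependency relations, and the threshold of 3.84 (the chi-square critical value at p = 0.05 with one degree of freedom) is chosen only for illustration. The names `llr`, `bootstrap`, and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def _xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) statistic for a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def bootstrap(sentences, seeds, threshold=3.84, max_iter=10):
    """Grow a term set from seeds using pairwise association scores.

    sentences: list of sets of candidate terms (assumed pre-extracted).
    threshold: illustrative cut-off, an assumption of this sketch.
    """
    n = len(sentences)
    df = Counter(t for s in sentences for t in s)
    co = Counter(
        frozenset(pair) for s in sentences for pair in combinations(sorted(s), 2)
    )
    known = set(seeds)
    for _ in range(max_iter):
        added = set()
        for cand in df.keys() - known:
            for term in known:
                k11 = co[frozenset((cand, term))]  # both terms present
                k12 = df[cand] - k11               # candidate only
                k21 = df[term] - k11               # known term only
                k22 = n - k11 - k12 - k21          # neither
                # Require positive association: co-occurrence above chance.
                positive = k11 * n > df[cand] * df[term]
                if positive and llr(k11, k12, k21, k22) >= threshold:
                    added.add(cand)
                    break
        if not added:  # nothing new was found, so stop iterating
            break
        known |= added
    return known


# Toy corpus: each sentence is reduced to its set of candidate terms.
toy = (
    [{"cellphone", "battery"}] * 4
    + [{"battery", "charger"}] * 4
    + [{"weather"}] * 8
)
result = bootstrap(toy, {"cellphone"})
print(sorted(result))
```

In this toy run a single seed, "cellphone", first pulls in "battery" and then, through "battery", reaches "charger", while the unrelated term "weather" is never added. This mirrors the iterative enlargement described above, although a real system would score feature-opinion, feature-feature, and opinion-opinion pairs obtained from dependency relations rather than from raw co-occurrence.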