How do we analyze sentiments over a set of opinionated Twitter messages? This question has been widely studied in recent years, with a prominent line of work based on classification techniques: messages are classified according to the implicit attitude of the writer with respect to a query term. A major concern, however, is that Twitter (and other media channels) follows the data stream model, so the classifier must operate with limited resources, including limited labeled data for training classification models. This imposes serious challenges on current classification techniques, which must be constantly fed with fresh training messages in order to track sentiment drift and provide up-to-date sentiment analysis. We propose solutions to this problem. The heart of our approach is a training-augmentation procedure that takes a small training seed as input and automatically incorporates new relevant messages into the training data. Classification models are produced on the fly using association rules, which are kept up to date incrementally, so that at any given time the model properly reflects the sentiments in the event being analyzed. To track sentiment drift, training messages are projected on a demand-driven basis, according to the content of the message being classified. Projecting the training data offers a series of advantages, including the ability to quickly detect trending information emerging in the stream. We analyzed major events in 2010 and show that prediction performance remains about the same, or even increases, as the stream progresses and new training messages are acquired. This result holds for different languages, even when the sentiment distribution changes over time or the initial training seed is rather small.
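The abstract combines three ideas: association rules mined from labeled messages, projection of the training data onto the vocabulary of the incoming message, and self-augmentation of the training set with confidently classified messages. The following is a minimal sketch of that loop, not the authors' implementation; the class name, the single-term rules, and the confidence thresholds are illustrative assumptions.

```python
from collections import Counter


class SentimentStreamClassifier:
    """Hypothetical sketch: demand-driven association rules + self-training."""

    def __init__(self, seed, min_conf=0.7):
        # seed: list of (tokens, label); grows incrementally as the stream passes
        self.training = list(seed)
        self.min_conf = min_conf

    def classify(self, tokens):
        tokens = set(tokens)
        # Demand-driven projection: keep only training messages that share
        # at least one term with the message being classified.
        projected = [(t, y) for t, y in self.training if tokens & set(t)]
        if not projected:
            return None, 0.0
        # Mine simple one-term rules (term -> label) from the projection,
        # keeping those whose confidence reaches min_conf.
        votes = Counter()
        for term in tokens:
            labels = [y for t, y in projected if term in t]
            if not labels:
                continue
            label, n = Counter(labels).most_common(1)[0]
            conf = n / len(labels)
            if conf >= self.min_conf:
                votes[label] += conf
        if not votes:
            return None, 0.0
        label, score = votes.most_common(1)[0]
        return label, score / sum(votes.values())

    def augment(self, tokens, threshold=0.9):
        # Training augmentation: confidently classified messages become
        # fresh training data, letting the model track sentiment drift.
        label, conf = self.classify(tokens)
        if label is not None and conf >= threshold:
            self.training.append((tuple(tokens), label))
        return label
```

A toy run: seed the classifier with a couple of labeled messages, classify, and feed confident predictions back in via `augment`, so the rules stay current as the stream evolves.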
We also derive lower bounds for prediction performance, and we show that our approach is highly effective under diverse learning scenarios, providing gains ranging from 7% to 58%.