We propose a classification model for tweet streams on Twitter, which are representative of document streams whose statistical properties change over time. Our model addresses several problems that hinder the classification of tweets, in particular the problem that word-occurrence probabilities change at different rates for different words. For each word, our model switches between two probability estimates, one based on the full data and one based on recent data, when it detects a change in that word's probability. This switching lets the model both learn stationary words accurately and respond quickly to bursty words. We then explain how to implement the model with a word suffix array, which is a full-text search index. The word suffix array allows the model to handle the temporal attributes of word n-grams effectively. Experiments on three tweet data sets demonstrate that our model achieves statistically significantly higher topic-classification accuracy than conventional temporally-aware classification models.
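The per-word switching idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how changes are detected or how the estimates are computed, so the class name `SwitchingWordEstimator`, the fixed-size sliding window, the add-alpha smoothing, and the ratio-threshold change test are all assumptions made for illustration. The sketch also uses plain counters rather than a word suffix array.

```python
from collections import Counter, deque


class SwitchingWordEstimator:
    """Hypothetical sketch: per-word switch between full and recent estimates."""

    def __init__(self, window_size=1000, ratio_threshold=2.0, alpha=1.0):
        self.window_size = window_size          # assumed: "recent" = last N tokens
        self.ratio_threshold = ratio_threshold  # assumed change-detection rule
        self.alpha = alpha                      # assumed add-alpha smoothing
        self.window = deque()                   # recent tokens, oldest first
        self.full_counts = Counter()
        self.recent_counts = Counter()
        self.full_total = 0
        self.vocab = set()

    def observe(self, tokens):
        """Update the full-history and recent-window counts with new tokens."""
        for tok in tokens:
            self.vocab.add(tok)
            self.full_counts[tok] += 1
            self.full_total += 1
            self.window.append(tok)
            self.recent_counts[tok] += 1
            if len(self.window) > self.window_size:
                old = self.window.popleft()
                self.recent_counts[old] -= 1

    def _smoothed(self, count, total):
        vocab_size = max(len(self.vocab), 1)
        return (count + self.alpha) / (total + self.alpha * vocab_size)

    def full_prob(self, word):
        """Estimate from all data seen so far (stable for stationary words)."""
        return self._smoothed(self.full_counts[word], self.full_total)

    def recent_prob(self, word):
        """Estimate from the recent window only (responsive to bursts)."""
        return self._smoothed(self.recent_counts[word], len(self.window))

    def has_changed(self, word):
        """Assumed test: the two estimates differ by at least the threshold ratio."""
        p_full, p_recent = self.full_prob(word), self.recent_prob(word)
        ratio = max(p_full, p_recent) / min(p_full, p_recent)
        return ratio >= self.ratio_threshold

    def prob(self, word):
        """Use the recent estimate only for words whose probability has changed."""
        if self.has_changed(word):
            return self.recent_prob(word)
        return self.full_prob(word)


est = SwitchingWordEstimator(window_size=10, ratio_threshold=2.0)
est.observe(["a", "b"] * 10)   # stationary background words
stationary_changed = est.has_changed("a")
est.observe(["quake"] * 10)    # sudden burst of a new word
bursty_changed = est.has_changed("quake")
```

In this toy run the background word `a` keeps its full-data estimate while the stream is stationary, and the bursty word `quake` switches to its recent-window estimate. A classifier such as naive Bayes could call `prob` per class in place of a single global estimate; that pairing is likewise an assumption rather than something the abstract states.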