Mining time-changing data streams
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Support vector machine active learning with applications to text classification
The Journal of Machine Learning Research
Mining confident co-location rules without a support threshold
Proceedings of the 2003 ACM symposium on Applied computing
Fast mining of spatial collocations
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Content based SMS spam filtering
Proceedings of the 2006 ACM symposium on Document engineering
Data Streams: Models and Algorithms (Advances in Database Systems)
Data Streams: Models and Algorithms (Advances in Database Systems)
Feature engineering for mobile (SMS) spam filtering
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Mining correlated bursty topic patterns from coordinated text streams
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
A usability comparison of three alternative message formats for an SMS banking service
International Journal of Human-Computer Studies
Proceedings of the first workshop on Online social networks
Categorizing and mining concept drifting data streams
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
The impact of mobile telephony on developing country micro-enterprise: A nigerian case study
Information Technologies and International Development
Active learning with statistical models
Journal of Artificial Intelligence Research
Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Natural Language Processing with Python
Natural Language Processing with Python
Short text classification in twitter to improve information filtering
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Streaming first story detection with application to Twitter
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Subword variation in text message classification
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A hybrid rule/model-based finite-state framework for normalizing SMS messages
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A latent variable model for geographic lexical variation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Crisis MT: developing a cookbook for MT in crisis situations
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Short message communications: users, topics, and in-language processing
Proceedings of the 2nd ACM Symposium on Computing for Development
High performance mining of social media data
Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
Open domain event extraction from twitter
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Accurate unsupervised joint named-entity extraction from unaligned parallel text
NEWS '12 Proceedings of the 4th Named Entity Workshop
Crowdsourcing and the crisis-affected community
Information Retrieval
Cross-lingual geo-parsing for non-structured data
Proceedings of the 7th Workshop on Geographic Information Retrieval
Hi-index | 0.01 |
Crisis-affected populations are often able to maintain digital communications but in a sudden-onset crisis any aid organizations will have the least free resources to process such communications. Information that aid agencies can actually act on, 'actionable' information, will be sparse so there is great potential to (semi)automatically identify actionable communications. However, there are hurdles as the languages spoken will often be under-resourced, have orthographic variation, and the precise definition of 'actionable' will be response-specific and evolving. We present a novel system that addresses this, drawing on 40,000 emergency text messages sent in Haiti following the January 12, 2010 earthquake, predominantly in Haitian Kreyol. We show that keyword/ngram-based models using streaming MaxEnt achieve up to F=0.21 accuracy. Further, we find current state-of-the-art subword models increase this substantially to F=0.33 accuracy, while modeling the spatial, temporal, topic and source contexts of the messages can increase this to a very accurate F=0.86 over direct text messages and F=0.90-0.97 over social media, making it a viable strategy for message prioritization.