Open domain event extraction from Twitter

Authors:
Alan Ritter; Mausam; Oren Etzioni; Sam Clark
Affiliations:
University of Washington, Seattle, WA, USA; University of Washington, Seattle, WA, USA; University of Washington, Seattle, WA, USA; Decide, Inc., Seattle, WA, USA
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

Tweets are the most up-to-date and inclusive stream of information and commentary on current events, but they are also fragmented and noisy, motivating the need for systems that can extract, aggregate and categorize important events. Previous work on extracting structured representations of events has focused largely on newswire text; Twitter's unique characteristics present new challenges and opportunities for open-domain event extraction. This paper describes TwiCal, the first open-domain event-extraction and categorization system for Twitter. We demonstrate that accurately extracting an open-domain calendar of significant events from Twitter is indeed feasible. In addition, we present a novel approach for discovering important event categories and classifying extracted events based on latent variable models. By leveraging large volumes of unlabeled data, our approach achieves a 14% increase in maximum F1 over a supervised baseline. A continuously updating demonstration of our system can be viewed at http://statuscalendar.com; our NLP tools are available at http://github.com/aritter/twitter_nlp.