Community-based classification of noun phrases in Twitter

Authors:
Freddy Chong Tat Chua;William W. Cohen;Justin Betteridge;Ee-Peng Lim
Affiliations:
Singapore Management University, Singapore, Singapore;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Singapore Management University, Singapore, Singapore
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Many event monitoring systems rely on counting known keywords in streaming text data to detect sudden spikes in frequency. However, the dynamic and conversational nature of Twitter makes it hard to select keywords for monitoring in advance. Here we consider a method for automatically finding noun phrases (NPs) to serve as keywords for event monitoring in Twitter. Finding NPs has two aspects: identifying the boundaries of the subsequence of words that represents the NP, and classifying the NP into a broad category such as politics or sports. To classify an NP, we define its feature vector using not just the words but also the author's behavior and social activities. Our results show that we can classify many NPs using a sample of training data from a knowledge base.
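
The keyword-counting baseline that the abstract starts from can be illustrated with a minimal sketch. This is not the authors' method; it only shows, under stated assumptions, how counting known keywords per time window and flagging sudden spikes might look. The function names (keyword_counts, detect_spikes), the whitespace tokenizer, and the mean-plus-deviation threshold rule are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def keyword_counts(windows, keywords):
    """Count occurrences of each known keyword in every time window.

    `windows` is a list of windows; each window is a list of message strings.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    keywords = {k.lower() for k in keywords}
    counts = []
    for messages in windows:
        window_counter = Counter()
        for text in messages:
            for token in text.lower().split():
                if token in keywords:
                    window_counter[token] += 1
        counts.append(window_counter)
    return counts


def detect_spikes(counts, keyword, min_history=3, threshold=3.0):
    """Return indices of windows whose count for `keyword` is a spike.

    A window is flagged when its count exceeds the mean of all earlier
    windows by more than `threshold` standard deviations. The deviation
    is floored at 1.0 so a flat history does not flag tiny fluctuations.
    """
    series = [c[keyword] for c in counts]
    spikes = []
    for i in range(min_history, len(series)):
        history = series[:i]
        mu = mean(history)
        sigma = max(pstdev(history), 1.0)
        if series[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


# Hypothetical toy stream: four time windows of short messages.
windows = [
    ["quake drill"],
    ["lunch time"],
    ["quake news"],
    ["quake quake", "quake now", "big quake", "quake again quake"],
]
counts = keyword_counts(windows, ["quake"])
spikes = detect_spikes(counts, "quake")
```

A sketch like this only works when the keyword list is known beforehand, which is the limitation that motivates finding and classifying NPs automatically.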