A tutorial on learning with Bayesian networks
Proceedings of the NATO Advanced Study Institute on Learning in graphical models
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Effective Methods for Improving Naive Bayes Text Classifiers
PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
Disambiguating Web appearances of people in a social network
WWW '05 Proceedings of the 14th international conference on World Wide Web
CVPRW '06 Proceedings of the 2006 Conference on Computer Vision and Pattern Recognition Workshop
Pattern Recognition and Machine Learning (Information Science and Statistics)
Pattern Recognition and Machine Learning (Information Science and Statistics)
Ensembles of Region Based Classifiers
CIT '07 Proceedings of the 7th IEEE International Conference on Computer and Information Technology
Web People Search via Connection Analysis
IEEE Transactions on Knowledge and Data Engineering
Exploiting context analysis for combining multiple entity resolution systems
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Twitter power: Tweets as electronic word of mouth
Journal of the American Society for Information Science and Technology
Gathering and ranking photos of named entities with high precision, high recall, and diversity
Proceedings of the third ACM international conference on Web search and data mining
Short text classification in twitter to improve information filtering
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
From web data to entities and back
CAiSE'10 Proceedings of the 22nd international conference on Advanced information systems engineering
Quality-aware similarity assessment for entity matching in Web data
Information Systems
Searching for spam: detecting fraudulent accounts via web search
PAM'13 Proceedings of the 14th international conference on Passive and Active Measurement
Hi-index | 0.00 |
Twitter is a micro-blogging service on the Web, where people can enter short messages, which then become visible to other users of the service. While the topics of these messages varies, there are a lot of messages where the users express their opinions about companies or products. Since the twitter service is very popular, the messages form a rich source of information for companies. They can learn with the help of data mining and sentiment analysis techniques, how their customers like their products or what is the general perception of the company. There is however a great obstacle for analyzing the data directly: as the company names are often ambiguous, one needs first to identify, which messages are related to the company. In this paper we address this question. We present various techniques to classify tweet messages, whether they are related to a given company or not, for example, whether a message containing the keyword "apple" is about the company Apple Inc.. We present simple techniques, which make use of company profiles, which we created semi-automatically from external Web sources. Our advanced techniques take ambiguity estimations into account and also automatically extend the company profiles from the twitter stream itself. We demonstrate the effectiveness of our methods through an extensive set of experiments.