The proliferation of the Internet has generated not only huge volumes of unstructured information in the form of web documents, but also a large amount of text in the form of emails, blogs, and feedback. The data generated from online communication is a potential gold mine for discovering knowledge. Text analytics has matured and is being successfully employed to mine important information from unstructured text documents. Most text analytics techniques rely on Natural Language Processing methods that assume the underlying text is clean and correct. Statistical techniques, though not as accurate as linguistic mechanisms, are also employed to overcome this dependence on clean text. The chief bottleneck in designing statistical mechanisms, however, is their dependence on appropriately annotated training data. Neither class of methods is well suited to mining information from online communication text, because such text is often noisy: it is informally written and suffers from spelling mistakes, grammatical errors, improper punctuation, and irregular capitalization. This paper focuses on opinion extraction from noisy text data. It aims to extract and consolidate customer opinions from blogs and feedback at multiple levels of granularity. Ours is a hybrid approach: we first employ a semi-supervised method to learn domain knowledge from a training repository that contains both noisy and clean text, and then employ localized linguistic techniques to extract opinion expressions from noisy text. We have developed a system based on this approach, which provides the user with a platform to analyze opinion expressions extracted from a repository.