A domain independent framework to extract and aggregate analogous features in online reviews

  • Authors:
  • Archana Bhattarai;Nobal Niraula;Vasile Rus;King-Ip Lin

  • Affiliations:
  • Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2012

Abstract

Extracting and detecting features from online reviews is both important and challenging, especially when domain knowledge is not explicitly available. Moreover, opinions about the same feature of a product or service are frequently expressed in various lexical forms. In this paper, we present a novel framework to automatically detect, extract, and aggregate semantically related features of reviewed products and services. Our model uses sentence-level syntactic and lexical information to detect candidate feature words, and corpus-level co-occurrence statistics to group semantically similar features, yielding high-precision feature detection. This high-precision feature aggregation gives our model a distinct advantage over state-of-the-art approaches such as double propagation: it produces short, succinct feature sets, whereas existing approaches can generate thousands of features. We evaluate our model on online reviews from two unrelated domains, restaurants and cameras, to verify its domain independence. In these evaluations, our model outperformed existing state-of-the-art probabilistic models.
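
The two stages named in the abstract, sentence-level candidate detection followed by corpus-level co-occurrence grouping, can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword and opinion-word lists, the rule "keep content words from sentences that contain an opinion word", the cosine threshold, and all function names are assumptions made for illustration, and a simple lexical filter stands in for the syntactic analysis the paper describes.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical word lists; a real system would use a POS tagger and an opinion lexicon.
STOPWORDS = {"the", "a", "an", "is", "was", "were", "are", "and", "but", "very", "of", "to", "it"}
OPINION_WORDS = {"friendly", "rude", "tasty", "bland", "great", "bad"}


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def candidate_features(sentences):
    """Sentence-level step (assumed heuristic): in sentences containing an
    opinion word, count every token that is neither a stopword nor an opinion word."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        if OPINION_WORDS & set(tokens):
            counts.update(t for t in tokens if t not in STOPWORDS and t not in OPINION_WORDS)
    return counts


def context_vectors(sentences, candidates):
    """Corpus-level step: for each candidate, count the non-stopword tokens
    that co-occur with it in the same sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in tokenize(sentence) if t not in STOPWORDS]
        for target in tokens:
            if target in candidates:
                vectors[target].update(t for t in tokens if t != target)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def group_features(vectors, threshold=0.5):
    """Greedy grouping (assumed): add a word to the first group whose
    first member has a similar co-occurrence vector, else start a new group."""
    groups = []
    for word in sorted(vectors):
        for group in groups:
            if cosine(vectors[word], vectors[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


# Toy corpus, invented for this example.
reviews = [
    "The waiter was friendly",
    "The server was friendly",
    "The waiter was rude",
    "The server was rude",
    "The pizza was tasty",
    "The pasta was tasty",
]
candidates = candidate_features(reviews)
groups = group_features(context_vectors(reviews, candidates))
print(groups)
```

In this toy corpus, "waiter" and "server" share the same opinion contexts and so fall into one group, which illustrates how different lexical forms of the same feature might be aggregated into a short feature set.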