Mining opinions from the Web: Beyond relevance retrieval

  • Authors:
  • Lun-Wei Ku;Hsin-Hsi Chen

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan;Department of Computer Science and Information Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Documents discussing public affairs, common themes, interesting products, and so on, are reported and distributed on the Web. Positive and negative opinions embedded in documents are useful references and feedbacks for governments to improve their services, for companies to market their products, and for customers to purchase their objects. Web opinion mining aims to extract, summarize, and track various aspects of subjective information on the Web. Mining subjective information enables traditional information retrieval (IR) systems to retrieve more data from human viewpoints and provide information with finer granularity. Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarities. Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the nonsupportive evidence. Opinion tracking captures subjective information from various genres and monitors the developments of opinions from spatial and temporal dimensions. To demonstrate and evaluate the proposed opinion mining algorithms, news and bloggers' articles are adopted. Documents in the evaluation corpora are tagged in different granularities from words, sentences to documents. In the experiments, positive and negative sentiment words and their weights are mined on the basis of Chinese word structures. The f-measure is 73.18% and 63.75% for verbs and nouns, respectively. Utilizing the sentiment words mined together with topical words, we achieve f-measure 62.16% at the sentence level and 74.37% at the document level. © 2007 Wiley Periodicals, Inc.