Statement map: reducing web information credibility noise through opinion classification

  • Authors:
  • Koji Murakami;Eric Nichols;Junta Mizuno;Yotaro Watanabe;Shouko Masuda;Hayato Goto;Megumi Ohki;Chitose Sao;Suguru Matsuyoshi;Kentaro Inui;Yuji Matsumoto

  • Affiliations:
  • Nara Institute of Science and Technology, Ikoma, Nara, Japan;Tohoku University, Sendai, Miyagi, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;Tohoku University, Sendai, Miyagi, Japan;Osaka Prefecture University, Osaka, Osaka, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan;Tohoku University, Sendai, Miyagi, Japan;Nara Institute of Science and Technology, Ikoma, Nara, Japan

  • Venue:
  • AND '10 Proceedings of the fourth workshop on Analytics for noisy unstructured text data
  • Year:
  • 2010

Abstract

On the Internet, users often encounter noise in the form of spelling errors or unknown words; however, dishonest, unreliable, or biased information also acts as noise that makes it difficult to find credible sources of information. As people come to rely on the Internet for more and more information, reducing this credibility noise grows ever more urgent. The goal of the STATEMENT MAP project is to help Internet users evaluate the credibility of information sources by mining the Web for a variety of viewpoints on their topics of interest and presenting those viewpoints to users, together with supporting evidence, in a way that makes clear how they are related. In this paper, we show how a STATEMENT MAP system can be constructed by combining Information Retrieval (IR) and Natural Language Processing (NLP) technologies, focusing on the task of organizing statements retrieved from the Web by viewpoint. We frame this as a semantic relation classification task and identify four semantic relations: [AGREEMENT], [CONFLICT], [CONFINEMENT], and [EVIDENCE]. The former two relations are identified by measuring semantic similarity through sentence alignment, while the latter two are identified through sentence-internal discourse processing. As a prelude to end-to-end user evaluation of STATEMENT MAP, we present a large-scale evaluation of semantic relation classification between user queries and Internet texts in Japanese and conduct a detailed error analysis to identify remaining areas for improvement.
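
To make the two-stage idea in the abstract concrete, the following is a minimal toy sketch, not the authors' system. It assumes English text rather than Japanese, replaces sentence alignment with plain content-word overlap, and replaces discourse processing with a lookup of cue words. Every name, word list, and threshold in it (`classify`, `NEGATIONS`, `EVIDENCE_MARKERS`, `CONFINEMENT_MARKERS`, `threshold=0.6`) is a hypothetical choice made for illustration only.

```python
import re

# Hypothetical cue lists and stopwords, chosen only for this illustration.
NEGATIONS = {"not", "no", "never", "cannot"}
EVIDENCE_MARKERS = {"because", "since"}
CONFINEMENT_MARKERS = {"if", "unless", "when"}
STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "it", "of", "to"}


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def content_words(toks):
    """Drop stopwords and negation cues, keeping the words used for overlap."""
    return {t for t in toks if t not in STOPWORDS and t not in NEGATIONS}


def split_clause(toks):
    """Split at the first non-initial discourse cue.

    Returns (main_clause_tokens, relation_or_None). A real system would use
    discourse parsing; this only looks for a cue word.
    """
    for i, t in enumerate(toks):
        if i == 0:
            continue  # toy limitation: sentence-initial cues are ignored
        if t in EVIDENCE_MARKERS:
            return toks[:i], "EVIDENCE"
        if t in CONFINEMENT_MARKERS:
            return toks[:i], "CONFINEMENT"
    return toks, None


def classify(query, statement, threshold=0.6):
    """Return the list of relation labels the statement bears to the query."""
    q_toks = tokens(query)
    main, discourse = split_clause(tokens(statement))
    q_words, m_words = content_words(q_toks), content_words(main)
    # Stand-in for alignment: share of query content words found in the main clause.
    if not q_words or len(q_words & m_words) / len(q_words) < threshold:
        return []
    q_neg = any(t in NEGATIONS for t in q_toks)
    m_neg = any(t in NEGATIONS for t in main)
    relations = ["CONFLICT" if q_neg != m_neg else "AGREEMENT"]
    if discourse is not None:
        relations.append(discourse)
    return relations


if __name__ == "__main__":
    query = "Green tea prevents colds"
    for statement in (
        "Green tea prevents colds if you drink it daily",
        "Green tea never prevents colds because trials found no effect",
        "Coffee is popular in Italy",
    ):
        print(statement, "->", classify(query, statement))
```

The sketch keeps the two decisions separate, as the abstract does: overlap plus polarity decides between [AGREEMENT] and [CONFLICT], and a sentence-internal cue decides whether [CONFINEMENT] or [EVIDENCE] is also attached. Word overlap and cue lookup are crude stand-ins and would not handle paraphrase, scope of negation, or Japanese morphology.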