Reasoning about knowledge from the web

  • Authors: Gjergji Kasneci
  • Affiliations: Hasso-Plattner-Institute, Potsdam, Germany
  • Venue: ICWE'12 Proceedings of the 12th international conference on Current Trends in Web Engineering
  • Year: 2012

Abstract

In the presence of a vast amount of user-generated content revolving around entities such as people, locations, products, and events, document-oriented retrieval seems rather old-fashioned. Imagine an HIV-relevant search task with the goal of finding drugs that may interfere with HIV protease inhibitors. Retrieving an exhaustive list of explicit results (i.e., drugs that may interfere with HIV protease inhibitors) can be crucial for people suffering from HIV, whose health depends on the unmediated effect of protease inhibitors. Moreover, it might be desirable to have the drugs in the result list ranked by their probability of interfering with protease inhibitors. In order to automatically retrieve such an exhaustive list of ranked answers, two subtasks have to be addressed: (1) knowledge about drugs that stand in an interference relationship to protease inhibitors needs to be extracted from various web pages and appropriately combined, and (2) the drugs need to be ranked by their probability of interfering with protease inhibitors. Neither of these tasks can be addressed by state-of-the-art search engines. Expecting the user to manually inspect retrieved documents to construct an exhaustive list of answers is simply unrealistic. As a matter of fact, major players in the search engine industry have recognized these issues and are attempting to shift the focus towards knowledge retrieval. For example, in 2010, Google acquired Metaweb, the company behind Freebase, one of the largest knowledge bases with explicit facts about real-world entities. In 2011, Google's search group was restructured and renamed the "knowledge group" [6]. Another example is Microsoft's Bing, which has undergone similar changes in recent years.
By the end of 2009, Bing was returning Wolfram Alpha results for entity-related and scholarly queries [8], and by the end of 2010, Bing had announced the new "health search experience" with the focus "on further enabling people to get relevant information and make better decisions" [7].