Deriving query intents from web search engine queries

  • Authors:
  • Dirk Lewandowski; Jessica Drechsler; Sonja von Mach

  • Affiliations:
  • Hamburg University of Applied Sciences, Department of Information, Finkenau 35, D-22081, Hamburg, Germany (all three authors)

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2012

Abstract

The purpose of this article is to test the reliability of query intents derived from queries, either by the user who entered the query or by another juror. We report the findings of three studies. First, we conducted a large-scale classification study (~50,000 queries) using a crowdsourcing approach. Next, we used clickthrough data from a search engine log to validate the judgments given by the jurors in the crowdsourcing study. Finally, we conducted an online survey on a commercial search engine's portal. Because we used the same queries for all three studies, we were also able to compare the results and the effectiveness of the different approaches. We found that neither the crowdsourcing approach, in which jurors classified queries originating from other users, nor the questionnaire approach, in which searchers were asked about the query they had just entered into a Web search engine, led to satisfactory results. This leads us to conclude that there was little understanding of the classification tasks, even though both groups of jurors were given detailed instructions. Although we used manual classification, our research also has important implications for automatic classification. We must question the success of approaches that use automatic classification and compare its performance to a baseline from human jurors. © 2012 Wiley Periodicals, Inc.