Keyword spices: a new method for building domain-specific web search engines

  • Authors:
  • Satoshi Oyama; Takashi Kokubo; Toru Ishida; Teruhiro Yamada; Yasuhiko Kitamura

  • Affiliations:
  • Department of Social Informatics, Kyoto University, Kyoto, Japan; NTT Docomo, Inc. and Department of Social Informatics, Kyoto University, Kyoto, Japan; Department of Social Informatics, Kyoto University, Kyoto, Japan; SANYO Electric Co., Ltd. and Laboratories of Image Information Science and Technology; Department of Information and Communication Engineering, Osaka City University, Osaka, Japan

  • Venue:
  • IJCAI'01 Proceedings of the 17th International Joint Conference on Artificial Intelligence - Volume 2
  • Year:
  • 2001

Abstract

This paper presents a new method for building domain-specific web search engines. Previous methods eliminate irrelevant documents from the accessed pages using heuristics based on human knowledge of the domain in question. Accordingly, they are hard to build and cannot be applied to other domains. The keyword spice method, in contrast, improves search performance by adding domain-specific keywords, called keyword spices, to the user's input query; the modified query is then forwarded to a general-purpose search engine. Keyword spices can be discovered automatically from web documents, which allows us to build high-quality domain-specific search engines in various domains without collecting heuristic knowledge. We describe a machine learning algorithm, a type of decision-tree learning algorithm, that extracts keyword spices. To demonstrate the value of the proposed approach, we conduct experiments in the domain of cooking. The results confirm the excellent performance of our method in terms of both precision and recall.
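
To make the query-modification idea concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes that a decision tree over keyword presence has already been learned, that each path ending in a "relevant" leaf becomes one conjunction, and that the disjunction of those conjunctions serves as the keyword spice. The tree, the keywords ("ingredients", "software", "tablespoon"), the function names, and the `AND`/`OR`/`NOT` query syntax are all hypothetical; real search engines differ in the Boolean operators they accept.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    relevant: bool  # True if documents reaching this leaf are in-domain


@dataclass
class Node:
    keyword: str      # test: does the document contain this keyword?
    present: "Tree"   # subtree followed when the keyword is present
    absent: "Tree"    # subtree followed when the keyword is absent


Tree = Union[Leaf, Node]
Literal = Tuple[str, bool]  # (keyword, must_be_present)


def positive_paths(tree: Tree, path: Tuple[Literal, ...] = ()) -> List[Tuple[Literal, ...]]:
    """Collect every root-to-leaf path that ends in a 'relevant' leaf."""
    if isinstance(tree, Leaf):
        return [path] if tree.relevant else []
    return (positive_paths(tree.present, path + ((tree.keyword, True),))
            + positive_paths(tree.absent, path + ((tree.keyword, False),)))


def spice_expression(tree: Tree) -> str:
    """Render the positive paths as a disjunction of conjunctions."""
    clauses = []
    for path in positive_paths(tree):
        terms = [kw if present else f"NOT {kw}" for kw, present in path]
        clauses.append("(" + " AND ".join(terms) + ")")
    return " OR ".join(clauses)


def spiced_query(user_query: str, tree: Tree) -> str:
    """Append the spice to the user's query (hypothetical Boolean syntax)."""
    return f"({user_query}) AND ({spice_expression(tree)})"


# Hand-built toy tree with hypothetical keywords for the cooking domain.
toy_tree = Node(
    keyword="ingredients",
    present=Node(keyword="software", present=Leaf(False), absent=Leaf(True)),
    absent=Node(keyword="tablespoon", present=Leaf(True), absent=Leaf(False)),
)

print(spiced_query("beef", toy_tree))
# (beef) AND ((ingredients AND NOT software) OR (NOT ingredients AND tablespoon))
```

The spiced query string is what would be forwarded to the general-purpose search engine in place of the user's original input. In this sketch the tree is written by hand; learning it from labeled web documents is the part the paper's algorithm addresses.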