Kosmix: high-performance topic exploration using the deep web

  • Authors:
  • Anand Rajaraman

  • Affiliations:
  • Kosmix Corporation, Mountain View, CA

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Kosmix lies at the intersection of two important trends: topic exploration and the Deep Web. Topic exploration is a new approach to information discovery on the web that satisfies certain use cases not served well by conventional web search. The Deep Web, an inhospitable region for web crawlers, is emerging as a significant information resource. We describe the anatomy of Kosmix, the first general-purpose topic exploration engine to harness the Deep Web using a federated search approach. We focus in particular on the Kosmix approach to query tranformation and caching, which is essential to ensure reasonable performance.