Employing web mining and data fusion to improve weak ad hoc retrieval

  • Authors:
  • Kui-Lam Kwok;Laszlo Grunfeld;Peter Deng

  • Affiliations:
  • Computer Science Department, Queens College, City University of New York, Flushing, NY;Computer Science Department, Queens College, City University of New York, Flushing, NY;Computer Science Department, Queens College, City University of New York, Flushing, NY

  • Venue:
  • Information Processing and Management: an International Journal - Special issue: AIRS2005: Information retrieval research in Asia
  • Year:
  • 2007

Abstract

When a user issues a reasonable query to a retrieval system and obtains no relevant documents, he or she is bound to feel frustrated. We call such cases weak queries and weak retrievals. Improving their effectiveness is an important issue for ad hoc retrieval and would be most rewarding for these users. We explain why data fusion of sufficiently dissimilar retrieval lists can improve weak query results and confirm this with experiments using short and medium-size queries. To obtain sufficiently dissimilar retrieval lists, we propose composing alternate queries through web search and mining, employing them for retrieval against the target collection, and combining the resulting lists with the retrieval list of the original query. Methods of forming web probes from longer queries, including salient term selection and query text window rotation, are investigated. When compared with normal ad hoc retrieval, web assistance and data fusion can more than double the original weak query effectiveness. Queries that are not weak can also improve along with the weak ones.
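
To make the idea of fusing retrieval lists concrete, the following is a minimal sketch in Python. The abstract does not say which fusion formula the authors use, so the sketch assumes a common textbook choice: min-max score normalization followed by a weighted sum of scores (CombSUM-style). The function names `min_max_normalize` and `fuse`, the document IDs, and the scores are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as equally good.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(run_lists, weights=None):
    """Combine several retrieval lists by a weighted sum of normalized scores.

    ASSUMPTION: weighted CombSUM is used here for illustration only;
    it is not confirmed as the method of the paper.
    """
    if weights is None:
        weights = [1.0] * len(run_lists)
    if len(weights) != len(run_lists):
        raise ValueError("need exactly one weight per retrieval list")
    fused = defaultdict(float)
    for run, weight in zip(run_lists, weights):
        for doc, score in min_max_normalize(run).items():
            fused[doc] += weight * score
    # Highest fused score first; ties are broken by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores: one list from the original query, one from an
# alternate query composed with web assistance.
original_run = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
web_assisted_run = {"d3": 5.0, "d4": 3.0, "d2": 1.0}

for doc_id, score in fuse([original_run, web_assisted_run]):
    print(doc_id, score)
```

In this toy input the two lists disagree strongly: document d3 is ranked last by the original query but first by the alternate query, and fusion lifts it to the top group. That is the kind of effect the abstract attributes to combining sufficiently dissimilar lists.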