Using navigation data to improve IR functions in the context of web search

Authors: Mark H. Hansen; Elizabeth Shriver
Affiliations: Bell Laboratories, Murray Hill, NJ; Bell Laboratories, Murray Hill, NJ
Venue: Proceedings of the tenth international conference on Information and knowledge management
Year: 2001

Abstract

As part of the process of delivering content, devices like proxies and gateways log valuable information about the activities and navigation patterns of users on the Web. In this study, we consider how this navigation data can be used to improve Web search. A query posted to a search engine together with the set of pages accessed during a search task is known as a search session. We develop a mixture model for the observed set of search sessions, and propose variants of the classical EM algorithm for training. The model itself yields a type of navigation-based query clustering. By implicitly borrowing strength between related queries, the mixture formulation allows us to identify the "highly relevant" URLs for each query cluster. Next, we explore methods for incorporating existing labeled data (the Yahoo! directory, for example) to speed convergence and help resolve low-traffic clusters. Finally, the mixture formulation also provides for a simple, hierarchical display of search results based on the query clusters. The effectiveness of our approach is evaluated using proxy access logs for the outgoing Lucent proxy.
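
The abstract does not give the form of the mixture model or the EM variants the authors use, so the following is only a minimal sketch of the general idea. It assumes each search session is a bag of URL indices drawn from one of a few latent clusters (a mixture of multinomials) and fits it with plain batch EM. The function name, the smoothing constant `alpha`, and the toy queries and sessions are all hypothetical and are not taken from the paper.

```python
import math
import random


def em_session_mixture(sessions, n_urls, n_clusters, n_iter=50, alpha=0.01, seed=0):
    """Fit a mixture of multinomials to sessions (lists of URL indices) with batch EM.

    Assumption: this is a generic textbook model, not the paper's exact formulation.
    Returns (weights, url_probs, resp), where resp[i][k] is the posterior
    probability that session i belongs to cluster k.
    """
    rng = random.Random(seed)
    weights = [1.0 / n_clusters] * n_clusters
    # Random, strictly positive starting point for each cluster's URL distribution.
    url_probs = []
    for _ in range(n_clusters):
        row = [rng.random() + 0.5 for _ in range(n_urls)]
        total = sum(row)
        url_probs.append([p / total for p in row])

    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster membership per session, computed in log space.
        resp = []
        for urls in sessions:
            log_post = [
                math.log(weights[k]) + sum(math.log(url_probs[k][u]) for u in urls)
                for k in range(n_clusters)
            ]
            top = max(log_post)
            unnorm = [math.exp(lp - top) for lp in log_post]
            norm = sum(unnorm)
            resp.append([x / norm for x in unnorm])

        # M-step: re-estimate mixing weights and URL probabilities.
        # alpha is a small additive smoothing term that keeps every probability > 0.
        for k in range(n_clusters):
            n_k = sum(r[k] for r in resp)
            weights[k] = (n_k + alpha) / (len(sessions) + alpha * n_clusters)
            counts = [alpha] * n_urls
            for r, urls in zip(resp, sessions):
                for u in urls:
                    counts[u] += r[k]
            total = sum(counts)
            url_probs[k] = [c / total for c in counts]
    return weights, url_probs, resp


# Hypothetical toy data: URLs 0-2 are visited for one topic, URLs 3-5 for another.
queries = ["jaguar car", "jaguar dealer", "jaguar price",
           "jaguar cat", "jaguar habitat", "jaguar zoo"]
sessions = [[0, 1], [0, 2], [1, 2, 0], [3, 4], [4, 5], [3, 5, 4]]

weights, url_probs, resp = em_session_mixture(sessions, n_urls=6, n_clusters=2)

# Hard cluster label per session (and hence per query).
labels = [max(range(2), key=lambda k: r[k]) for r in resp]

# The highest-probability URLs in each cluster play the role of "highly relevant" URLs.
top_urls = [sorted(range(6), key=lambda u: -url_probs[k][u])[:3] for k in range(2)]

for query, label in zip(queries, labels):
    print(query, "-> cluster", label)
```

Because all sessions in a cluster contribute to that cluster's URL distribution, a query with few sessions of its own still inherits evidence from related queries, which is the "borrowing strength" effect the abstract mentions. Grouping results by `labels` would give a simple two-level display of the kind the abstract describes, though the paper's actual display method may differ.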