Leveraging structural knowledge for hierarchically-informed keyword weight propagation in the web

Authors:
Jong Wook Kim;K. Selçuk Candan
Affiliations:
Comp. Sci. and Eng. Dept., Arizona State University, Tempe, AZ;Comp. Sci. and Eng. Dept., Arizona State University, Tempe, AZ
Venue:
WebKDD'06 Proceedings of the 8th Knowledge discovery on the web international conference on Advances in web mining and web usage analysis
Year:
2006

Abstract

Although web navigation hierarchies, such as Yahoo.com and the Open Directory Project, enable effective browsing, their individual nodes cannot be indexed for search independently. This is because the contents of the individual nodes in a hierarchy are related to the contents of their neighbors, ancestors, and descendants in the structure. In this paper, we show that significant improvements in precision can be obtained by leveraging knowledge about the structure of hierarchical web content. In particular, we propose a novel keyword weight propagation technique to properly enrich the data nodes in web hierarchies. Our approach relies on the context provided by neighbor entries in a given structure. We use this information to develop relative-content preserving keyword propagation schemes. We compare the results obtained through the proposed hierarchically-informed keyword weight (pre-)propagation schemes with existing state-of-the-art score and keyword propagation techniques and show that our approach significantly improves precision.
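
To make the general idea of keyword weight propagation in a hierarchy concrete, the following is a minimal illustrative sketch. It is not the scheme proposed in the paper, whose details the abstract does not give. It assumes a simple bottom-up rule in which each node adds a decayed share of its descendants' keyword weights to its own. The function name `propagate_up`, the `decay` parameter, and the toy hierarchy are all hypothetical.

```python
from collections import defaultdict


def propagate_up(tree, weights, root, decay=0.5):
    """Enrich each node's keyword weights with decayed weights from its subtree.

    tree:    dict mapping a node to the list of its children
    weights: dict mapping a node to {keyword: local weight}
    decay:   assumed damping factor applied once per hierarchy level
    This is a generic illustration, not the paper's propagation scheme.
    """
    enriched = {}

    def visit(node):
        combined = defaultdict(float)
        # Start from the node's own (local) keyword weights.
        for keyword, weight in weights.get(node, {}).items():
            combined[keyword] += weight
        # Add a decayed share of each child's already-enriched weights.
        for child in tree.get(node, []):
            for keyword, weight in visit(child).items():
                combined[keyword] += decay * weight
        enriched[node] = dict(combined)
        return enriched[node]

    visit(root)
    return enriched


# Toy hierarchy (hypothetical data): one category with two subcategories.
tree = {"Science": ["Physics", "Biology"], "Physics": [], "Biology": []}
weights = {
    "Science": {"research": 1.0},
    "Physics": {"quantum": 1.0, "research": 0.5},
    "Biology": {"cell": 1.0},
}
enriched = propagate_up(tree, weights, "Science")
```

In this toy example the "Science" node, which locally carries only "research", also becomes retrievable for "quantum" and "cell" at reduced weight, while the leaf nodes keep their original weights. A scheme that also accounts for neighbors and ancestors, as the abstract describes, would need additional rules beyond this one-directional sketch.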