Contextualization using hyperlinks and internal hierarchical structure of Wikipedia documents

Authors:
Muhammad Ali Norozi;Paavo Arvola;Arjen P. de Vries
Affiliations:
Norwegian University of Science and Technology, Trondheim, Norway;University of Tampere, Tampere, Finland;Centrum Wiskunde & Informatica, Amsterdam, Netherlands
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Context surrounding hyperlinked semi-structured documents, externally in the form of citations and internally in the form of hierarchical structure, contains a wealth of useful but implicit evidence about a document's relevance. These rich sources of information should be exploited as contextual evidence. This paper proposes various methods of accumulating evidence from the context, and measures the effect of contextual evidence on retrieval effectiveness for document and focused retrieval of hyperlinked semi-structured documents. We propose a re-weighting model to contextualize (a) evidence from citations in a query-independent and query-dependent fashion (based on Markovian random walks) and (b) evidence accumulated from the internal tree structure of documents. The in-links and out-links of a node in the citation graph are used as external context, while the internal document structure provides internal, within-document context. We hypothesize that documents in a good context (having strong contextual evidence) should be good candidates to be relevant to the posed query, and vice versa. We tested several variants of contextualization and verified notable improvements in comparison with the baseline system and gold standards in the retrieval of full documents and focused elements.
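
The abstract does not give the re-weighting formula, so the following is only a minimal sketch of the general idea of query-independent contextualization from citations. It assumes a plain PageRank power iteration as the Markovian random walk and a simple linear interpolation as the re-weighting step. The function names, the toy link graph, the baseline scores, and the `weight` parameter are all hypothetical and are not taken from the paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [out-linked nodes]}.

    Used here as an assumed stand-in for a query-independent random walk.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank mass held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new_rank[t] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


def contextualize(base_scores, context_scores, weight=0.3):
    """Interpolate each baseline score with max-normalized context evidence.

    The linear form and the default weight are illustrative assumptions.
    """
    top = max(context_scores.values())
    return {
        doc: (1.0 - weight) * score + weight * context_scores[doc] / top
        for doc, score in base_scores.items()
    }


# Hypothetical citation graph (document -> documents it links to)
# and hypothetical baseline retrieval scores for one query.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
base_scores = {"A": 0.40, "B": 0.35, "C": 0.30, "D": 0.45}

walk = pagerank(links)
reweighted = contextualize(base_scores, walk)
ranking = sorted(reweighted, key=reweighted.get, reverse=True)
print(ranking)
```

In this toy setup, document D has the highest baseline score but no in-links, so its contextual evidence is weak and it drops in the re-weighted ranking, while the well-linked documents move up. A query-dependent variant would restrict or bias the walk toward documents retrieved for the query; that step is not sketched here.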