Extracting Relevant Snippets from Web Documents through Language Model based Text Segmentation

Authors:
Qing Li;K. Selcuk Candan;Yan Qi
Venue:
WI '07 Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence
Year:
2007

Citing 0
Cited 4

Extraction of the contents in the web texts by content-density distribution

International Journal of Knowledge Engineering and Soft Data Paradigms
Extraction of web texts using content-density distribution

AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Gem-based entity-knowledge maintenance

Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
Towards improving the online shopping experience: A client-based platform for post-processing Web search results

Web Intelligence and Agent Systems

Abstract

Extracting a query-oriented snippet (or passage) and highlighting the relevant information in a long document can help reduce the result navigation cost of end users. While the traditional approach of highlighting matching keywords helps when the search is keyword oriented, finding appropriate snippets to represent matches to more complex queries requires novel techniques that can succinctly characterize the relevance of various parts of a document to the given query. In this paper, we present a language-model-based method for accurately detecting the most relevant passages of a given document. Unlike previous work in passage retrieval, which focuses on searching relevance nodes for filtering of preoccupied passages, we focus on query-informed segmentation for snippet extraction. The algorithms presented in this paper are currently being deployed in OASIS, a system to help reduce the navigational load of blind users in accessing Web-based digital libraries.
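
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what query-informed snippet extraction with a language model could look like, not the authors' method. It assumes a unigram model per sentence, Jelinek-Mercer smoothing against the whole document, and a maximum-sum contiguous run of sentences as the snippet. All function names and the `lam` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def sentence_scores(sentences, query, lam=0.5):
    """Score each sentence by a smoothed query log-likelihood ratio.

    Assumption: P(w | sentence) is a unigram estimate interpolated with the
    document model (Jelinek-Mercer, weight lam). The score is
    log P(query | sentence) - log P(query | document), so sentences that use
    query terms more densely than the document as a whole score above zero.
    """
    tokenized = [tokenize(s) for s in sentences]
    doc_counts = Counter(w for toks in tokenized for w in toks)
    doc_len = sum(doc_counts.values())
    vocab = len(doc_counts)

    def p_doc(word):
        # Add-one smoothing keeps unseen query terms from yielding log(0).
        return (doc_counts[word] + 1) / (doc_len + vocab + 1)

    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        total = 0.0
        for w in query_terms:
            ml = counts[w] / len(toks) if toks else 0.0
            total += math.log(lam * ml / p_doc(w) + (1.0 - lam))
        scores.append(total)
    return scores


def best_segment(scores):
    """Return (start, end), inclusive, of the contiguous run with the largest sum."""
    best_sum, best_range = float("-inf"), (0, 0)
    run_sum, run_start = 0.0, 0
    for i, s in enumerate(scores):
        if run_sum <= 0:
            run_sum, run_start = s, i  # start a fresh run at sentence i
        else:
            run_sum += s
        if run_sum > best_sum:
            best_sum, best_range = run_sum, (run_start, i)
    return best_range


def extract_snippet(document, query, lam=0.5):
    """Split into sentences, score them, and return the best contiguous run."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
    if not sentences:
        return ""
    start, end = best_segment(sentence_scores(sentences, query, lam))
    return " ".join(sentences[start:end + 1])
```

Calling `extract_snippet(document, "language model")` returns the run of adjacent sentences whose combined score is highest. Because the segment boundaries depend on the query, the same document can yield different snippets for different queries, which is the property the abstract contrasts with filtering passages that were fixed in advance.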