Query Relaxation by Structure and Semantics for Retrieval of Logical Web Documents

Authors:
Wen-Syan Li;K. Selçuk Candan;Quoc Vu;Divyakant Agrawal
Affiliations:
-;-;-;-
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2002

Citing 17
Cited 5

Focus+context views of World-Wide Web nodes

HYPERTEXT '97 Proceedings of the eighth ACM conference on Hypertext
Provably good routing tree construction with multi-port terminals

Proceedings of the 1997 international symposium on Physical design
Inferring Web communities from link topology

Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems
Cut as a querying unit for WWW, Netnews, and E-mail

Proceedings of the ninth ACM conference on Hypertext and hypermedia: links, objects, time and space---structure in hypermedia systems
Improved algorithms for topic distillation in a hyperlinked environment

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic resource compilation by analyzing hyperlink structure and associated text

WWW7 Proceedings of the seventh international conference on World Wide Web 7
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Efficient crawling through URL ordering

WWW7 Proceedings of the seventh international conference on World Wide Web 7
PowerBookmarks: a system for personalizable Web information organization, sharing, and management

WWW '99 Proceedings of the eighth international conference on World Wide Web
A polylogarithmic approximation algorithm for the group Steiner tree problem

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Clustering Categorical Data: An Approach Based on Dynamical Systems

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
DTL's DataSpot: Database Exploration Using Plain Language

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Proximity Search in Databases

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Beyond Steiner's Problem: A VLSI Oriented Generalization

WG '89 Proceedings of the 15th International Workshop on Graph-Theoretic Concepts in Computer Science
Bounds on the quality of approximate solutions to the Group Steiner Problem

WG '90 Proceedings of the 16th International Workshop on Graph-Theoretic Concepts in Computer Science
Providing Government Information on the Internet: Experiences with THOMAS

Providing Government Information on the Internet: Experiences with THOMAS

Untangling compound documents on the web

Proceedings of the fourteenth ACM conference on Hypertext and hypermedia
Topic segmentation of message hierarchies for indexing and navigation support

WWW '05 Proceedings of the 14th international conference on World Wide Web
A unified interaction scheme for information sources

Journal of Intelligent Information Systems
Keyword search on external memory data graphs

Proceedings of the VLDB Endowment
SEA: Segment-enrich-annotate paradigm for adapting dialog-based content for improved accessibility

ACM Transactions on Information Systems (TOIS)

Abstract

Because the Web encourages hypertext and hypermedia authoring (e.g., in HTML or XML), Web authors tend to create documents composed of multiple pages connected by hyperlinks. A Web document may be authored in multiple ways, such as 1) with all information in one physical page, or 2) with a main page and the related information in separate linked pages. Existing Web search engines, however, return only individual physical pages that contain the query keywords. In this paper, we introduce the concept of an information unit, which can be viewed as a logical Web document consisting of multiple physical pages treated as one atomic retrieval unit. We present an algorithm to efficiently retrieve information units. Our algorithm can perform progressive query processing. These functionalities are essential for information retrieval on the Web and in large XML databases. We also present experimental results on synthetic graphs and real Web data.