Improving web search by the identification of contextual information

Authors:
Fernando Aguiar
Affiliations:
Dept. of Networks, Information and Multimedia, École Nationale Supérieure des Mines de Saint-Etienne, 158 cours Fauriel, F-42023 Saint-Etienne, France
Venue:
Intelligent exploration of the web
Year:
2003

Abstract

The work presented in this chapter proposes a new model of information retrieval system for searching the hypertexts that underlie Web sites. The model is based on the construction of a two-level index: one level describes each HTML page individually, and the other describes the context of that page. We assume that the textual content of an HTML page is not sufficient for an indexing process to grasp the information the page conveys, and that the missing contextual information is located in complementary pages. The complementary pages of a given page are identified with the help of a complementary measure, which is based on both content and link analysis and assesses how complementary two pages are. Using both local and contextual information when indexing pages improves the quality of their index and, in turn, the effectiveness of the search engine.
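
The two-level index described above can be illustrated with a minimal Python sketch. The abstract does not give the actual complementary measure, so everything here is an assumption made for illustration only: the function names (`complementarity`, `build_two_level_index`, `search`), the weights `alpha` and `beta`, the `threshold`, and the choice of cosine similarity plus a link indicator as the combined score. It is not the author's method, only one plausible way to combine content and link evidence.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also strip markup and stop words.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def complementarity(p, q, pages, links, alpha=0.5):
    # HYPOTHETICAL measure: weighted mix of content similarity and a link indicator.
    # The chapter's real measure is not specified in the abstract.
    content = cosine(Counter(tokenize(pages[p])), Counter(tokenize(pages[q])))
    linked = 1.0 if q in links.get(p, set()) or p in links.get(q, set()) else 0.0
    return alpha * content + (1 - alpha) * linked


def build_two_level_index(pages, links, threshold=0.5):
    # Level 1: terms of each page taken individually.
    local = {p: Counter(tokenize(text)) for p, text in pages.items()}
    # Level 2: terms gathered from the pages judged complementary (assumed threshold).
    context = {}
    for p in pages:
        ctx = Counter()
        for q in pages:
            if q != p and complementarity(p, q, pages, links) >= threshold:
                ctx.update(local[q])
        context[p] = ctx
    return local, context


def search(query, local, context, beta=0.5):
    # Score = local term frequency + beta * contextual term frequency (assumed weighting).
    terms = tokenize(query)
    scores = {}
    for p in local:
        s = sum(local[p][t] + beta * context[p][t] for t in terms)
        if s > 0:
            scores[p] = s
    return sorted(scores, key=scores.get, reverse=True)


# Toy site: "home" links to "team"; "misc" is isolated.
pages = {
    "home": "mines saint-etienne research lab",
    "team": "research team members",
    "misc": "cafeteria menu",
}
links = {"home": {"team"}}

local, context = build_two_level_index(pages, links)
print(search("lab", local, context))
```

In this toy site, "team" never mentions "lab" but still matches the query through its context level, because it is linked to "home". With the default `alpha` and `threshold`, any linked pair counts as complementary; a stricter threshold would also require some content overlap.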