An indexing model of HTML documents

Authors:
Andrea Molinari; Gabriella Pasi; R. A. Marques Pereira
Affiliations:
University of Trento, Via Inama 5, 38100 Trento, Italy; National Council of Research (ITIM-CNR), Via Ampère 56, Milano, Italy; University of Trento, Via Inama 5, 38100 Trento, Italy
Venue:
Proceedings of the 2003 ACM symposium on Applied computing
Year:
2003

Abstract

The spread of the World Wide Web, and the resulting growth in the production and exchange of textual information, calls for effective information retrieval systems. The HyperText Markup Language (HTML) is widely used to define the "typographical" appearance of documents on the Internet and on intranets. This paper proposes an indexing model for HTML documents in which the weight of an index term is computed by weighting its occurrences differently according to the tags in which they appear.
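
The idea of weighting term occurrences by their enclosing tags can be illustrated with a minimal Python sketch. It is not the model defined in the paper: the tag weights in `TAG_WEIGHTS`, the rule that an occurrence takes the largest weight among its open tags, the normalization by the maximum, and the names `TagWeightedIndexer`, `weighted_term_counts`, and `normalize` are all assumptions made for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical tag weights, chosen only for illustration (not from the paper).
TAG_WEIGHTS = {"title": 4.0, "h1": 3.0, "h2": 2.0, "b": 1.5, "strong": 1.5}
DEFAULT_WEIGHT = 1.0


class TagWeightedIndexer(HTMLParser):
    """Accumulates, per term, the sum of the weights of its occurrences."""

    def __init__(self):
        super().__init__()
        self.open_tags = []       # stack of currently open tags
        self.weights = Counter()  # term -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; stray end tags are ignored.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Assumption: an occurrence takes the largest weight among its open tags.
        weight = max(
            (TAG_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for term in re.findall(r"[a-z0-9]+", data.lower()):
            self.weights[term] += weight


def weighted_term_counts(html):
    """Return a dict mapping each term to its tag-weighted occurrence count."""
    indexer = TagWeightedIndexer()
    indexer.feed(html)
    indexer.close()
    return dict(indexer.weights)


def normalize(weights):
    """Scale weights into [0, 1] by dividing by the largest one (an assumption)."""
    top = max(weights.values(), default=0.0)
    return {t: w / top for t, w in weights.items()} if top else {}


doc = (
    "<html><head><title>Fuzzy retrieval</title></head>"
    "<body><h1>Retrieval</h1>"
    "<p>fuzzy models for <b>retrieval</b> of text</p></body></html>"
)
counts = weighted_term_counts(doc)
print(counts["retrieval"])  # 4.0 (title) + 3.0 (h1) + 1.5 (b) = 8.5
print(counts["fuzzy"])      # 4.0 (title) + 1.0 (p) = 5.0
```

In this sketch "retrieval" outranks "fuzzy" even though "fuzzy" occurs in the title as well, because its remaining occurrences fall in more heavily weighted tags. A plain term-frequency index would see three occurrences against two and lose the information about where they appear.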