Web document modeling

Authors:
Alessandro Micarelli; Filippo Sciarrone; Mauro Marinilli
Affiliations:
Department of Computer Science and Automation, Artificial Intelligence Laboratory, Roma Tre University, Rome, Italy (all three authors)
Venue:
The adaptive web
Year:
2007

Abstract

A recurring problem in adaptive Web-based systems is the modeling of documents, which carry domain-specific information used for a number of purposes. Application areas such as Information Search, Focused Crawling and Content Adaptation (among many others) rely on techniques and approaches that model documents effectively. For example, a document usually needs preliminary processing to extract the relevant information in a format that the system can process automatically. The objective of this chapter is to support the other chapters by providing a basic overview of the most common and useful techniques and approaches related to document modeling. It describes high-level techniques for modeling Web documents, such as the Vector Space Model, and a number of AI approaches, such as Semantic Networks, Neural Networks and Bayesian Networks. The chapter is not meant as a substitute for more comprehensive discussions of these topics. Rather, it provides a brief and informal introduction to the main concepts of document modeling, with a focus on the systems presented in the rest of the book as concrete examples of those concepts.
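
As an informal illustration of the Vector Space Model named in the abstract, the following minimal Python sketch turns a few short texts into tf-idf weighted term vectors and compares them with the cosine measure. It is not taken from the chapter: the function names (`tokenize`, `tf_idf_vectors`, `cosine`), the whitespace tokenizer, the particular tf-idf variant (raw term frequency times `log(N / df)`) and the sample texts are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real system would
    # also strip markup, drop stop words and possibly stem the terms.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Weight = raw term frequency * log(N / df); one common variant.
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, for illustration only.
docs = [
    "adaptive web systems model documents",
    "neural networks model documents",
    "focused crawling of the web",
]
vectors = tf_idf_vectors(docs)
sim_01 = cosine(vectors[0], vectors[1])  # share "model" and "documents"
sim_12 = cosine(vectors[1], vectors[2])  # share no terms
```

In this sketch the first two texts share terms and so receive a positive similarity, while the second and third share none and score zero. Terms that occur in every document get weight zero under this tf-idf variant, which is one reason practical systems often use smoothed alternatives.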