Adaptive focused crawling

Authors:
Alessandro Micarelli; Fabio Gasparetti
Affiliations:
Department of Computer Science and Automation, Artificial Intelligence Laboratory, Roma Tre University, Rome, Italy;Department of Computer Science and Automation, Artificial Intelligence Laboratory, Roma Tre University, Rome, Italy
Venue:
The adaptive web
Year:
2007

Abstract

The sheer amount of information available on the Web makes it hard for users to locate resources on particular topics of interest. Traditional search tools, such as search engines, do not always cope with this problem, that is, they do not always help users find the right information. In the personalized search domain, focused crawlers are receiving increasing attention as a well-founded alternative for searching the Web. Unlike a standard crawler, which traverses the Web downloading all the documents it comes across, a focused crawler is designed to retrieve only documents related to a given topic of interest, reducing the network and computational resources required. This chapter presents an overview of the focused crawling domain and, in particular, of the approaches that include some form of adaptivity. Adaptivity makes it possible to change the system's behavior during the search according to the particular environment and its relationship with the given input parameters.
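
The contrast between a standard crawler and a focused one can be illustrated with a minimal best-first sketch. It is not the chapter's own method. The toy link graph `TOY_WEB`, the term-overlap `relevance` score, and the rule that a link inherits its parent page's score are all assumptions made for illustration, and nothing is fetched over the network.

```python
import heapq

# Hypothetical toy web: URL -> (page text, outgoing links). No network access.
TOY_WEB = {
    "a": ("focused crawler topic search", ["b", "c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("web crawler frontier priority", ["e"]),
    "d": ("pasta sauce", []),
    "e": ("adaptive focused crawler", []),
}


def relevance(text, topic_terms):
    """Fraction of topic terms found in the page text (an assumed scoring rule)."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(web, seed, topic_terms, budget):
    """Best-first crawl: visit at most `budget` pages, most promising links first."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        text, links = web[url]
        score = relevance(text, topic_terms)
        visited.append((url, score))
        for link in links:
            if link not in seen and link in web:
                seen.add(link)
                # Assumption: a link is as promising as the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return visited


result = focused_crawl(TOY_WEB, "a", {"focused", "crawler"}, budget=4)
```

With a budget of four pages, the crawl reaches the on-topic page `e` (found through the partially relevant page `c`) and never spends a download on `d`, which is only reachable from the off-topic page `b`. An adaptive variant would additionally update the scoring rule while the crawl is running.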