In this paper, we present an overview of the extensible Retrieval, Annotation and Caching Engine (eRACE), a modular and distributed intermediary infrastructure that collects information from heterogeneous Internet sources according to registered profiles or end-user requests. Collected information is stored for filtering, transformation, aggregation, and subsequent personalized or wide-area dissemination over the wireline or wireless Internet. We study the architecture and implementation of the main module of eRACE, an HTTP proxy named WebRACE. WebRACE consists of a high-performance, distributed, and multithreaded Web crawler, a multithreaded filtering processor, and an Object Cache. We discuss the implementation of WebRACE in Java, describe a number of performance optimizations, and present an assessment of its performance.
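The three components named above (crawler, filtering processor, Object Cache) can be pictured as a small producer/consumer pipeline. The following is only a minimal sketch under stated assumptions, not WebRACE's actual implementation: the class `PipelineSketch`, the method `run`, the in-memory `fakeWeb` map that stands in for HTTP fetches, and the single-keyword filter are all hypothetical names and simplifications introduced here for illustration. A single filter thread is used for brevity, whereas the filtering processor described above is multithreaded.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a crawler -> filter -> cache pipeline.
class PipelineSketch {

    // A fetched document: its URL and its body text.
    static final class Page {
        final String url;
        final String body;
        Page(String url, String body) { this.url = url; this.body = body; }
    }

    // Sentinel object that tells the filter stage no more pages will arrive.
    private static final Page DONE = new Page("", "");

    // fakeWeb is an assumption: it replaces real HTTP requests with a lookup table.
    static Map<String, String> run(Map<String, String> fakeWeb, List<String> urls,
                                   String keyword, int crawlerThreads) {
        BlockingQueue<Page> fetched = new LinkedBlockingQueue<>();
        Map<String, String> cache = new ConcurrentHashMap<>(); // stands in for the Object Cache

        // Crawler stage: a fixed pool of worker threads "downloads" the URLs.
        ExecutorService crawlers = Executors.newFixedThreadPool(crawlerThreads);
        for (String url : urls) {
            crawlers.submit(() -> {
                String body = fakeWeb.get(url); // stand-in for an HTTP GET
                if (body != null) {
                    fetched.add(new Page(url, body));
                }
            });
        }

        // Filter stage: keeps only pages whose body contains the keyword.
        Thread filter = new Thread(() -> {
            try {
                while (true) {
                    Page page = fetched.take();
                    if (page == DONE) {
                        break;
                    }
                    if (page.body.contains(keyword)) {
                        cache.put(page.url, page.body);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        filter.start();

        try {
            // Wait for all crawl tasks, then signal the filter to stop.
            crawlers.shutdown();
            crawlers.awaitTermination(10, TimeUnit.SECONDS);
            fetched.add(DONE);
            filter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cache;
    }
}
```

Decoupling the stages with a queue lets the number of crawler threads be tuned independently of the filtering work, which is one plausible reason to structure such a system this way; the paper's own design choices may differ.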