An enormous number of news articles are added and updated on the Internet around the clock, which requires frequent and intensive processing by news retrieval systems. The news retrieval systems in use today barely meet this requirement. Cloudpress 2.0, presented in this paper, is designed and implemented to be scalable, robust, and fault tolerant. It exploits the MapReduce paradigm and the power of cloud computing for fetching, processing, organizing, and summarizing news articles. Furthermore, it uses novel approaches for parallel processing, for storing news articles in a distributed database, and for visualizing them in 3D. It also includes a novel query expansion feature for searching news articles, and it supports on-the-fly, extractive summarization of news articles based on the input query.
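To make the idea of query-based extractive summarization concrete, the following is a minimal sketch and not the method used in Cloudpress 2.0, which the abstract does not describe. It assumes a simple term-overlap score: each sentence is scored by how often the query terms occur in it, and the top-scoring sentences are returned in their original order. The function names (`split_sentences`, `tokenize`, `summarize`) and the `max_sentences` parameter are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def summarize(article, query, max_sentences=2):
    """Return up to max_sentences sentences that best match the query.

    Assumption: relevance is the total count of query terms in a sentence.
    Sentences with no query term are never selected.
    """
    query_terms = set(tokenize(query))
    sentences = split_sentences(article)

    scored = []
    for idx, sentence in enumerate(sentences):
        counts = Counter(tokenize(sentence))
        score = sum(counts[term] for term in query_terms)
        scored.append((score, idx))

    # Highest score first; ties broken by position in the article.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]

    # Restore document order so the extract reads naturally.
    chosen = sorted(idx for score, idx in top if score > 0)
    return [sentences[i] for i in chosen]
```

Because the sentences are extracted verbatim and scored only against the query, the summary can be computed on the fly at search time without any precomputed model. A real system would likely add stop-word filtering and term weighting.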