WebFilter: A High-throughput XML-based Publish and Subscribe System
Proceedings of the 27th International Conference on Very Large Data Bases
ICSM '01 Proceedings of the IEEE International Conference on Software Maintenance (ICSM'01)
Lucene in Action (In Action series)
WebGuard: A Web Filtering Engine Combining Textual, Structural, and Visual Content-Based Analysis
IEEE Transactions on Knowledge and Data Engineering
Electronic Notes in Theoretical Computer Science (ENTCS)
IEEE Transactions on Software Engineering
The Internet is now the main source of information for millions of people and enterprises. However, the information on the Internet has not yet been classified and, consequently, searching for information is one of the most important tasks performed by users and systems. For human users of the WWW in particular, searching for information is the main (and time-consuming) task they perform. To address this problem, both the industrial and the academic communities have developed many methods and tools to index and search Web pages. The most widespread solution is the use of search engines such as Google and Yahoo; however, while current search engines can be a suitable way to find a particular Web page, they are of no use for finding the relevant information within that page. Hence, once a Web page is found, the user must search through it to verify whether the needed information is there. This problem has not yet been satisfactorily solved. In this paper we present a tool that automatically extracts from a Web page the information (text, images, etc.) related to a filtering criterion, without using semantic specifications or indexes and without requiring offline parsing or compilation. The tool has been published as an extension for the Firefox Web browser.