Dynamic reference sifting: a case study in the homepage domain
Selected papers from the sixth international conference on World Wide Web
Focused crawling: a new approach to topic-specific Web resource discovery
WWW '99 Proceedings of the eighth international conference on World Wide Web
Machine Learning
Modern Information Retrieval
Automating the Construction of Internet Portals with Machine Learning
Information Retrieval
Using Reinforcement Learning to Spider the Web Efficiently
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Early experiences with a 3D model search engine
Web3D '03 Proceedings of the eighth international conference on 3D Web technology
What's there and what's not?: focused crawling for missing documents in digital libraries
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Print-n-link: weaving the paper web
Proceedings of the 2006 ACM symposium on Document engineering
SlideSeer: a digital library of aligned document and presentation pairs
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Finding what is missing from a digital library: A case study in the Computer Science field
Information Processing and Management: an International Journal
PaSE: locating online copy of scientific documents effectively
ICADL'04 Proceedings of the 7th international Conference on Digital Libraries: international collaboration and cross-fertilization
The rapid dissemination of new research results on the World Wide Web poses new challenges for search engines. In this paper we describe a new approach to finding scientific papers relevant to a predefined research area. Unlike other approaches, we do not search for web pages that contain certain keywords; instead, we search for web pages created by scientists who are active in the research area under consideration. The names of these scientists are obtained from the DBLP server [9]. The HomePageSearch system finds the scientists' home pages from their names, and Mops finds research papers close to those home pages. It creates an index of these papers and makes it accessible on the web. We conclude that this kind of focused crawling is very effective for building high-quality collections and indices of scientific papers using ordinary desktop hardware.
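The step of finding papers "close to" a home page can be illustrated with a minimal sketch. This is not the Mops implementation, which the abstract does not describe in detail. It assumes that "close" means reachable within a small number of links from the home page, that intermediate pages are only followed on the home page's own host, and that papers are recognized by file suffix. The function names, the suffix list, and the sample link graph are all hypothetical, and the link graph is an in-memory dictionary rather than live web pages.

```python
from collections import deque
from urllib.parse import urlparse

# Assumption: documents with these suffixes are treated as research papers.
PAPER_SUFFIXES = (".pdf", ".ps", ".ps.gz")


def find_papers_near(home_page, links, max_depth=2):
    """Breadth-first walk from home_page over the link graph `links`.

    Returns paper-like URLs found at most max_depth links away, in
    discovery order. Only pages on the home page's host are expanded;
    papers themselves may live on any host.
    """
    host = urlparse(home_page).netloc
    seen = {home_page}
    queue = deque([(home_page, 0)])
    papers = []
    while queue:
        url, depth = queue.popleft()
        for target in links.get(url, []):
            if target in seen:
                continue
            seen.add(target)
            if target.lower().endswith(PAPER_SUFFIXES):
                papers.append(target)
            elif depth + 1 < max_depth and urlparse(target).netloc == host:
                # Expand this page later; its links are one step further away.
                queue.append((target, depth + 1))
    return papers


def build_index(home_pages, links, max_depth=2):
    """Map each author name to the papers found near that author's home page."""
    return {
        name: find_papers_near(url, links, max_depth)
        for name, url in home_pages.items()
    }


# Hypothetical link graph standing in for fetched and parsed web pages.
SAMPLE_LINKS = {
    "http://uni.example/~ada/": [
        "http://uni.example/~ada/pubs.html",
        "http://other.example/news.html",
    ],
    "http://uni.example/~ada/pubs.html": [
        "http://uni.example/~ada/crawl.pdf",
        "http://archive.example/ada.ps.gz",
        "http://uni.example/~ada/deep.html",
    ],
    "http://uni.example/~ada/deep.html": ["http://uni.example/~ada/too-far.pdf"],
    "http://other.example/news.html": ["http://other.example/unrelated.pdf"],
}

if __name__ == "__main__":
    index = build_index({"Ada": "http://uni.example/~ada/"}, SAMPLE_LINKS)
    for author, found in index.items():
        print(author, found)
```

In this sketch the depth limit and the same-host restriction keep the crawl focused on one author's pages; a real system would additionally need page fetching, link extraction, and politeness controls, none of which are shown here.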