Finding scientific papers with HomePageSearch and MOPS

Authors:
Gerd Hoff;Martin Mundhenk
Affiliations:
Universität Trier, Trier, Germany;Friedrich-Schiller-Universität Jena, Jena, Germany
Venue:
SIGDOC '01 Proceedings of the 19th annual international conference on Computer documentation
Year:
2001

Cites (7):
Dynamic reference sifting: a case study in the homepage domain (Selected papers from the sixth international conference on World Wide Web)
Focused crawling: a new approach to topic-specific Web resource discovery (WWW '99 Proceedings of the eighth international conference on World Wide Web)
Machine Learning (Machine Learning)
Modern Information Retrieval (Modern Information Retrieval)
Automating the Construction of Internet Portals with Machine Learning (Information Retrieval)
Digital Libraries and Autonomous Citation Indexing (Computer)
Using Reinforcement Learning to Spider the Web Efficiently (ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning)

Cited by (7):
Early experiences with a 3D model search engine (Web3D '03 Proceedings of the eighth international conference on 3D Web technology)
What's there and what's not?: focused crawling for missing documents in digital libraries (Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries)
Print-n-link: weaving the paper web (Proceedings of the 2006 ACM symposium on Document engineering)
SlideSeer: a digital library of aligned document and presentation pairs (Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries)
Finding what is missing from a digital library: A case study in the Computer Science field (Information Processing and Management: an International Journal)
PaMS: A component-based service for finding the missing full text of articles cataloged in a digital library (Information Systems)
PaSE: locating online copy of scientific documents effectively (ICADL'04 Proceedings of the 7th international Conference on Digital Libraries: international collaboration and cross-fertilization)

Abstract

The fast dissemination of new research results on the World Wide Web poses new challenges for search engines. In this paper we describe a new approach to finding scientific papers relevant to a predefined research area. Unlike other approaches, we do not search for web pages that contain certain keywords; instead, we search for web pages created by scientists who are active in the research area under consideration. The names of these scientists are obtained from the DBLP server [9]. The HomePageSearch system finds the home pages that belong to these names, and MOPS finds research papers close to those home pages, creates an index of these papers, and makes it accessible on the web. We conclude that such focused crawling is very effective for building high-quality collections and indices of scientific papers using ordinary desktop hardware.
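
The step of finding papers "close to" a home page can be illustrated with a minimal sketch. This is not the MOPS implementation, which the abstract does not describe in detail. It assumes that "close" means reachable within a small number of links on the same host, and that papers are recognisable by file suffix. The function name `find_papers`, the `max_depth` parameter, the suffix list, and the toy pages are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed paper file types; the abstract does not list any.
PAPER_SUFFIXES = (".pdf", ".ps", ".ps.gz")


class LinkExtractor(HTMLParser):
    """Collects the href targets of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_papers(home_page, fetch, max_depth=2):
    """Breadth-first crawl from one home page, staying on its host.

    `fetch` is a caller-supplied function mapping a URL to HTML text,
    or to None if the page is unavailable (hypothetical interface).
    """
    host = urlparse(home_page).netloc
    seen = {home_page}
    queue = deque([(home_page, 0)])
    papers = []
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)
            if target in seen:
                continue
            seen.add(target)
            if urlparse(target).path.lower().endswith(PAPER_SUFFIXES):
                # Looks like a paper file: record it, do not fetch it.
                papers.append(target)
            elif depth < max_depth and urlparse(target).netloc == host:
                # Ordinary page on the same host: follow it later.
                queue.append((target, depth + 1))
    return papers


# Toy in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://example.org/~alice/": (
        '<a href="pubs.html">Publications</a> '
        '<a href="http://other.example/">Elsewhere</a>'
    ),
    "http://example.org/~alice/pubs.html": (
        '<a href="papers/crawl.pdf">Crawling</a> '
        '<a href="papers/index.ps.gz">Indexing</a>'
    ),
}

papers = find_papers("http://example.org/~alice/", PAGES.get)
print(papers)
```

In a full pipeline of the kind the abstract outlines, the starting URLs would come from a home page search over author names, and the collected paper URLs would feed an indexer; both stages are outside this sketch.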