Automatically constructing a directory of molecular biology databases

Authors:
Luciano Barbosa;Sumit Tandon;Juliana Freire
Affiliations:
School of Computing, University of Utah;School of Computing, University of Utah;School of Computing, University of Utah
Venue:
DILS'07 Proceedings of the 4th international conference on Data integration in the life sciences
Year:
2007

Abstract

The volume of biology-related information available in online databases has exploded, but finding the right information can be challenging. Not only is this information spread over multiple sources, but it is often hidden behind the form interfaces of online databases. Several ongoing efforts, notably the NCBI databases and the NAR database compilation, aim to simplify the process of finding, integrating, and exploring these data. However, these approaches are not scalable and require substantial manual input. As an important step towards a scalable solution to this problem, we describe a new infrastructure that automates, to a large extent, the process of locating and organizing online databases. We show how this infrastructure can be used to automate the construction and maintenance of a Molecular Biology database collection. We also provide an evaluation showing that the infrastructure is scalable and effective: it efficiently locates and accurately identifies the relevant online databases.