BioExtract Server—An Integrated Workflow-Enabling System to Access and Analyze Heterogeneous, Distributed Biomolecular Data

Authors:
Carol Lushbough;Michael K. Bergman;Carolyn J. Lawrence;Doug Jennewein;Volker Brendel
Affiliations:
University of South Dakota, Vermillion;VisualMetrics Corporation, Coralville;USDA-ARS Iowa State University, Ames;University of South Dakota, Vermillion;Iowa State University, Ames
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2010

Abstract

Many in silico investigations in bioinformatics require access to multiple, distributed data sources and analytic tools. The requisite data sources may include large public data repositories, community databases, and project databases for use in domain-specific research. Different data sources frequently use distinct query languages and return results in their own formats, so researchers must either rely on a small number of primary data sources or become familiar with multiple query languages and formats. Similarly, the associated analytic tools often require specific input formats and produce tool-specific output, which makes it difficult to use the output of one tool as input to another. The BioExtract Server (http://bioextract.org) is a Web-based data integration application designed to consolidate, analyze, and serve data from heterogeneous biomolecular databases in the form of a mash-up. The basic operations of the BioExtract Server allow researchers, via their Web browsers, to specify data sources, flexibly query data sources, apply analytic tools, download result sets, and store query results for later reuse. As a researcher works with the system, their “steps” are saved in the background. At any time, these steps can be preserved long-term as a workflow simply by providing a workflow name and description.
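
The step-recording idea described in the abstract can be illustrated with a minimal Python sketch. It is not the BioExtract Server's implementation or API: the class names (Step, Session, Workflow), the method names, and the example source and tool names are all hypothetical, chosen only to show how actions recorded in the background might later be preserved as a workflow by supplying a name and description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Step:
    """One recorded user action (hypothetical structure)."""
    action: str
    params: Dict[str, Any]


@dataclass
class Workflow:
    """A named, described snapshot of recorded steps."""
    name: str
    description: str
    steps: List[Step]


@dataclass
class Session:
    """Records steps in the background as the researcher works."""
    steps: List[Step] = field(default_factory=list)

    def record(self, action: str, **params: Any) -> None:
        # Append each action with its parameters, in the order performed.
        self.steps.append(Step(action, dict(params)))

    def save_as_workflow(self, name: str, description: str) -> Workflow:
        # Copy the list so later steps do not alter the saved workflow.
        return Workflow(name, description, list(self.steps))


# Illustrative usage with placeholder source and tool names.
session = Session()
session.record("select_sources", sources=["source_a", "source_b"])
session.record("query", text="example query")
session.record("apply_tool", tool="example_tool")
workflow = session.save_as_workflow("demo", "Illustrative three-step workflow")
session.record("download", format="fasta")
```

In this sketch the saved workflow holds a copy of the steps recorded so far, so actions performed after saving stay in the session without changing the stored workflow.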