Researchers in Systems Biology routinely access vast collections of hidden web research resources that are freely available on the internet. These collections include online data repositories, online and downloadable data analysis tools, publications, text mining systems, visualization artifacts, etc. Almost always, these resources have complex data formats that are heterogeneous in representation, data type, interpretation, and even identity. As a result, researchers are often forced to develop analysis pipelines and data management applications that involve extensive and prohibitive manual interaction. Such approaches act as a barrier to the optimal use of these resources and thus impede the progress of research. In this paper, we discuss our experience of building a new middleware approach to data and application integration for Systems Biology that leverages recent developments in schema matching, wrapper generation, workflow management, and query language design. This approach advocates ad hoc integration of arbitrary resources and the construction of computational pipelines using a declarative language. We highlight the features and advantages of this new data management system, called LifeDB, and its query language, BioFlow. Based on our experience, we highlight the new challenges it raises and potential solutions to these research issues on the way toward a viable platform for large-scale autonomous data integration. We believe the research issues we raise are of general interest to the autonomous data integration community and apply equally to research unrelated to LifeDB.