Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Processing XML Streams with Deterministic Automata
ICDT '03 Proceedings of the 9th International Conference on Database Theory
Efficient Filtering of XML Documents for Selective Dissemination of Information
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
YFilter: Efficient and Scalable Filtering of XML Documents
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
BioFast: challenges in exploring linked life sciences sources
ACM SIGMOD Record
XML for Bioinformatics
Query planning in the presence of overlapping sources
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Optimizing monitoring queries over distributed data
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
BioScout: a life-science query monitoring system
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Hi-index | 0.00 |
Life science researchers want biological information in their interest to become available to them as soon as possible. A monitoring system is a solution that relieves biologists from periodic exploration of databases. In particular, it allows them to express their interest in certain data by means of queries/constraints; they are then notified when new data arrives satisfying these queries/constraints. We describe a sequence monitoring system XSeqM where users can combine metadata queries on sequence records with constraints on an alignment against a given source sequence. The system is an XML-based solution where constraints are specified through search fields in a user-friendly web interface and which are then translated to corresponding XPath-expressions. The system is easily extensible as addition of new databases to the system then only amounts to the specification of new mappings from search fields to XPath-expressions. To protect private source sequences obtained in labs, it is imperative that researchers do not have to upload their sequences to a general untrusted system, but that they can run XSeqM locally. To keep the system light-weight, we therefore introduce an optimization technique based on query containment to reduce the number of XPath-evaluations which constitutes the bottleneck of the system. We experimentally validate this technique and show that it can drastically improve the running time.