Bioinformatics
Recent developments in high-throughput proteomics technologies have made it possible to detect and identify low-abundance proteins. These technologies provide a new window through which proteomes can be analyzed. Despite holding great promise, the contribution of mass spectrometry-based proteomics to identifying novel diagnostic biomarkers has been disappointing. This failure has, in part, been attributed to the lack of effective strategies for determining which candidate biomarkers justify more expensive and time-consuming validation studies. An approach that bridges the gap between the unbiased experimental paradigm, which emphasizes comprehensive characterization of proteins, and the candidate-driven paradigm would overcome this limitation [38]. To this end, we have developed database operators that extend database management systems to analyze high-throughput proteomics and genomics data. By analyzing differentially expressed genes and proteins using pathway databases, these operators take advantage of established expert domain knowledge in pathway annotation to prioritize candidate biomarkers. They provide a systematic way of bridging the gap between the unbiased experimental paradigm and the candidate-driven paradigm. To test the operators, we analyzed a dataset of salivary proteins differentially expressed between pre-malignant and malignant oral lesions. Six proteins were identified as candidate biomarkers worthy of validation studies. A literature search reveals that these high-priority candidate biomarkers interact with proteins implicated in cancer development, which highlights their potential utility as biomarkers and demonstrates the effectiveness of our operators. The developed operators will help overcome one of the main challenges of high-throughput computational techniques: they provide a systematic way of bridging the gap between the unbiased, data-driven approach and the hypothesis-driven approach to prioritize candidate biomarkers worthy of more expensive and time-consuming validation studies.