The functional genomics and informatics community has made extensive microarray experimental data available online, facilitating independent evaluation of experimental conclusions and enabling researchers to access and reuse a growing body of gene expression knowledge. Although several data-exchange standards exist, many microarray experiment datasets are published using the MAGE-ML XML schema. Assessing the quality of published experiments is a challenging task, and there is no consensus among microarray users on a framework for measuring dataset quality. In this paper, we develop techniques based on DescribeX (a summary-based visualization tool for XML) that quantitatively and qualitatively analyze public MAGE-ML collections to gain insight into how the schema is actually used. We address specific questions such as the detection of common instance patterns and their coverage, the precision of the experiment descriptions, and the usage of controlled vocabularies. Our case study shows that DescribeX is a useful tool for evaluating microarray experiment data quality and that it enhances the understanding of the instance-level structure of MAGE-ML datasets.
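To make the idea of "common instance patterns and coverage" concrete, the following is a minimal sketch of a plain label-path summary over a small XML collection. It is not DescribeX or its AxPRE summaries; it only counts, for each root-to-element label path, the fraction of documents in which that path occurs. The function names `label_paths` and `path_coverage` and the toy element names are assumptions made for illustration and are not taken from MAGE-ML.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def label_paths(xml_text):
    """Return the set of root-to-element label paths in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    # Iterative depth-first traversal; each stack entry pairs a node with its path.
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def path_coverage(documents):
    """Map each label path to the fraction of documents that contain it."""
    if not documents:
        return {}
    counts = Counter()
    for doc in documents:
        # A path is counted once per document, however often it repeats inside it.
        counts.update(label_paths(doc))
    total = len(documents)
    return {path: n / total for path, n in counts.items()}


# Toy documents with made-up element names (hypothetical, not real MAGE-ML).
docs = [
    "<Experiment><Description/><Array><Design/></Array></Experiment>",
    "<Experiment><Array/></Experiment>",
]
coverage = path_coverage(docs)
for path in sorted(coverage):
    print(path, coverage[path])
```

Paths with coverage close to 1 indicate parts of the schema that nearly every dataset populates, while low-coverage paths point to optional or rarely used elements. A real analysis would need a richer summary than label paths alone, for example one that also distinguishes elements by their attributes or content.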