Integrating geographically dispersed, heterogeneous, complex biological databases is a key research area. A successful data integration system needs a simple, self-describing data exchange format. However, many biological databases provide their data as flat files, which are a poor data exchange format. XML, by contrast, can serve as both a powerful data model and a better data exchange format. In this paper, we present the Bio2X system, which transforms flat-file data into highly hierarchical XML data using a rule-based machine learning technique. Bio2X has been fully implemented in Java. Our experiments on transforming real-world biological data demonstrate the effectiveness of the Bio2X approach.
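To make the flat-file-to-XML idea concrete, the following is a minimal sketch and not the Bio2X implementation. It assumes a hypothetical flat-file layout in which each line starts with a two-letter code, and it uses a small hand-written rule table (`RULES`) in place of the rules Bio2X learns. The class name `FlatToXml`, the line codes, and the element names are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hand-written rules, not rules learned by Bio2X.
final class FlatToXml {
    // Hypothetical rules: two-letter line code -> slash-separated element path.
    // Every segment except the last is a shared container element; the last
    // segment is created anew for each matching line.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ID", "id");
        RULES.put("DE", "description");
        RULES.put("FT", "features/feature");
    }

    // Minimal element tree: a name, optional text, and ordered children.
    static final class Node {
        final String name;
        String text = "";
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        // Returns the first child with this name, creating it if absent.
        Node childNamed(String childName) {
            for (Node c : children) {
                if (c.name.equals(childName)) return c;
            }
            Node c = new Node(childName);
            children.add(c);
            return c;
        }

        void write(StringBuilder out) {
            out.append('<').append(name).append('>').append(escape(text));
            for (Node c : children) c.write(out);
            out.append("</").append(name).append('>');
        }
    }

    // Escapes the characters that are unsafe in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Converts one flat-file record into a single <entry> element.
    static String convert(String flatRecord) {
        Node root = new Node("entry");
        for (String line : flatRecord.split("\n")) {
            if (line.length() < 2) continue;
            String path = RULES.get(line.substring(0, 2));
            if (path == null) continue; // no rule for this line code
            String[] segments = path.split("/");
            Node parent = root;
            for (int i = 0; i < segments.length - 1; i++) {
                parent = parent.childNamed(segments[i]);
            }
            Node leaf = new Node(segments[segments.length - 1]);
            leaf.text = line.substring(2).trim();
            parent.children.add(leaf);
        }
        StringBuilder out = new StringBuilder();
        root.write(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String record = "ID   P00001\nDE   Example protein\n"
                + "FT   DOMAIN 10 50\nFT   DOMAIN 60 90\n//";
        System.out.println(convert(record));
    }
}
```

In this sketch the two `FT` lines end up as sibling `feature` elements under one shared `features` container, which is the kind of hierarchy a flat file leaves implicit. Lines with no matching rule, such as the `//` terminator, are skipped.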