Structural similarity mining in semi-structured microarray data for efficient storage construction

Authors:
Jongil Jeong;Dongil Shin;Chulho Cho;Dongkyoo Shin
Affiliations:
Department of Computer Science and Engineering, Sejong University, Seoul, Korea;Department of Computer Science and Engineering, Sejong University, Seoul, Korea;College of Business Administration, Kyung Hee University, Seoul, Korea;Department of Computer Science and Engineering, Sejong University, Seoul, Korea
Venue:
OTM'06 Proceedings of the 2006 international conference on On the Move to Meaningful Internet Systems: AWeSOMe, CAMS, COMINF, IS, KSinBIT, MIOS-CIAO, MONET - Volume Part I
Year:
2006

Abstract

Much research on storing XML data has been carried out, and some of it proposes methods that improve database performance by reducing the number of joins between tables. These methods are very efficient at deriving and optimizing tables from a DTD or XML schema in which elements and attributes are defined. Nevertheless, they are not effective for an XML schema for biological information such as microarray data because, although microarray data have complex hierarchies, only a few core values appear repeatedly within those hierarchies. In this paper, we propose a new algorithm that extracts the core features that occur repeatedly in an XML schema for biological information, and we explain how classification speed and efficiency can be improved by using a decision tree rather than pattern matching to classify structural similarities. We designed a database for storing biological information using the features extracted by our algorithm. Through experiments, we showed that the proposed classification algorithm also reduced the number of joins between tables.