Motivation: As an increasing number of protein structures become available, the need for algorithms that can quantify the similarity between protein structures increases as well. Thus, the comparison of protein structures, and their clustering according to a given similarity measure, is at the core of today's biomedical research. In this paper, we show how a Universal Similarity Metric (USM) inspired by algorithmic information theory can be used to calculate similarities between protein pairs. The method, besides being theoretically supported, is surprisingly simple to implement and computationally efficient.

Results: Structural similarity between proteins in four different datasets was measured using the USM. The sample employed represented alpha, beta, alpha-beta, TIM-barrel, globin and serpin protein types. The proposed metric allows for a correct measurement of similarity and classification of the proteins in the four datasets.

Availability: All the scripts and programs used for the preparation of this paper are available at http://www.cs.nott.ac.uk/~nxk/USM/protocol.html. On that web page the reader will find a brief description of how to use the various scripts and programs.

Supplementary information: The protein datasets used are collected in http://www.cs.nott.ac.uk/~nxk/USM/datasets.html. The calculated similarity values for the proteins used in this paper can be found in http://www.cs.nott.ac.uk/~nxk/USM/similar.html. The clustering of the dataset based on these similarity values can be found in http://www.cs.nott.ac.uk/~nxk/USM/clustering.html
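To make the idea of a compression-based similarity concrete, the following is a minimal sketch and not the scripts distributed with the paper. The USM is defined in terms of Kolmogorov complexity, which is uncomputable, so a practical approximation replaces it with the output size of a real compressor. The sketch assumes zlib as that compressor and assumes each structure has already been serialized to bytes; neither choice is stated in the abstract. The names `compressed_size` and `ncd`, and the random toy strings, are hypothetical placeholders rather than protein data.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Approximate the information content of `data` by its zlib-compressed length."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a computable stand-in for the USM.

    Values near 0 mean the inputs share most of their information;
    values near 1 mean they share almost none.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy inputs (assumption: placeholders only, not real structure encodings).
rng = random.Random(0)
alphabet = "ACDEFGHIKLMNPQRSTVWY"
a = "".join(rng.choice(alphabet) for _ in range(2000)).encode()
near_copy = a[:1000] + b"G" + a[1001:]  # same string with one position overwritten
unrelated = "".join(rng.choice(alphabet) for _ in range(2000)).encode()

print("near copy:", round(ncd(a, near_copy), 3))
print("unrelated:", round(ncd(a, unrelated), 3))
```

A pairwise matrix of such distances over a dataset could then be passed to any standard hierarchical clustering routine. The quality of the result depends heavily on the compressor and on how the structures are encoded, so both should be treated as tunable assumptions.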