Foundations of statistical natural language processing
Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology
In the era of structural biology, efficient and effective tools are needed to compare and align the 3D structures of biomolecules. Although a great number of structural comparison and alignment methods have been developed, none of them gives an exact solution to the problem. In this paper, we introduce a novel method for the structural alignment of proteins based on language modelling techniques. We summarize the secondary and tertiary structure of a protein in two textual sequences. The first sequence is used for the initial superposition of secondary structure elements, and the second is used to align the 3D structures of the two proteins being compared. To compare sequences, the method applies a technique from computational linguistics for analysing and comparing textual data: the cross-entropy measure over n-gram models is used to capture regularities between sequences of protein structures. Experiments were performed to compare the performance of the method with other structure alignment methods. The results reported here provide evidence for the usefulness of the new approach and for its advantages and applicability compared with other related methods.
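To make the cross-entropy measure over n-gram models concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes that each structure has already been encoded as a string over a small alphabet (here the hypothetical three-letter secondary-structure alphabet "HEC"), that the n-gram model is estimated from one sequence with add-one smoothing, and that the score is the average number of bits per n-gram of the other sequence. The function names, the alphabet, and the smoothing scheme are all illustrative assumptions; the abstract does not specify them.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count the n-grams of seq and the (n-1)-symbol contexts that start them."""
    positions = range(len(seq) - n + 1)
    grams = Counter(seq[i:i + n] for i in positions)
    contexts = Counter(seq[i:i + n - 1] for i in positions)
    return grams, contexts


def cross_entropy(train_seq, test_seq, n=3, alphabet="HEC"):
    """Average bits per n-gram of test_seq under an n-gram model of train_seq.

    Assumption: add-one (Laplace) smoothing over a fixed alphabet, so that
    n-grams never seen in train_seq still receive non-zero probability.
    """
    grams, contexts = ngram_counts(train_seq, n)
    vocab_size = len(alphabet)
    num_grams = len(test_seq) - n + 1
    if num_grams <= 0:
        raise ValueError("test_seq is shorter than n")

    total_bits = 0.0
    for i in range(num_grams):
        gram = test_seq[i:i + n]
        # Smoothed conditional probability P(last symbol | preceding n-1 symbols).
        p = (grams[gram] + 1) / (contexts[gram[:-1]] + vocab_size)
        total_bits -= math.log2(p)
    return total_bits / num_grams


if __name__ == "__main__":
    # Hypothetical encoded structures, used only to exercise the functions.
    reference = "HHHHHHCCEEEEECCHHHHHH"
    similar = "HHHHHCCEEEECCHHHHH"
    different = "ECECECECHCHCHCECEC"
    print(cross_entropy(reference, similar))
    print(cross_entropy(reference, different))
```

A lower value means the test sequence is better predicted by the model of the reference sequence, which is the sense in which the measure captures regularities shared between two encoded structures. The measure is asymmetric, so a symmetric score would need both directions combined, for example by averaging.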