Intelligent text processing techniques for textual-profile gene characterization

Authors:
Floriana Esposito;Marenglen Biba;Stefano Ferilli
Affiliations:
Department of Computer Science, University of Bari, Bari, Italy;Department of Computer Science, University of New York Tirana, Tirana, Albania;Department of Computer Science, University of Bari, Bari, Italy
Venue:
CIBB'09 Proceedings of the 6th international conference on Computational intelligence methods for bioinformatics and biostatistics
Year:
2009

Abstract

We present a suite of Machine Learning and knowledge-based components for textual-profile-based gene prioritization. Most genetic diseases have many potential candidate genes, and gene expression analysis typically produces a large number of co-expressed genes that could be responsible for a given disease. Extracting prior knowledge from text-based genomic information sources is therefore essential for reducing the list of candidate genes that are then analyzed further in the laboratory. The suite includes basic Natural Language Processing capabilities, advanced text classification and clustering algorithms, robust information extraction components based on qualitative and quantitative keyword extraction methods, and the exploitation of lexical knowledge bases for semantic text processing.
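
As a rough illustration of what textual-profile-based prioritization can look like, the sketch below builds a weighted keyword profile for each gene from its associated text and ranks genes by similarity to a disease description. It is not the suite described in the abstract: the TF-IDF weighting, the cosine similarity, the function names, the gene identifiers (GENE_A, GENE_B, GENE_C), and the sample texts are all assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, split on whitespace, keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_profiles(docs):
    """Map each document name to a {term: weight} textual profile.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1, so every weight
    is strictly positive.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(docs)
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))  # document frequency: count each term once per doc
    profiles = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        profiles[name] = {
            term: count * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
    return profiles


def cosine(p, q):
    # Cosine similarity between two sparse {term: weight} profiles.
    dot = sum(w * q[t] for t, w in p.items() if t in q)
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def rank_genes(disease_text, gene_texts):
    """Return (gene, score) pairs sorted by decreasing similarity."""
    docs = dict(gene_texts)
    docs["__disease__"] = disease_text  # reserved key, assumed not a gene name
    profiles = tfidf_profiles(docs)
    target = profiles.pop("__disease__")
    scores = [(gene, cosine(profile, target)) for gene, profile in profiles.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical gene descriptions, invented purely for illustration.
gene_texts = {
    "GENE_A": "mutation linked to cardiac muscle contraction defect",
    "GENE_B": "regulates pigment synthesis in skin cells",
    "GENE_C": "cardiac arrhythmia and muscle weakness reported",
}
ranking = rank_genes("inherited cardiac muscle disease", gene_texts)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this toy setting, a gene whose text shares no terms with the disease description scores zero and falls to the bottom of the list. A realistic pipeline would add the kinds of components the abstract names, such as proper linguistic preprocessing and lexical knowledge bases, in place of plain whitespace tokenization.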