Sequential patterns to discover and characterise biological relations

Authors:
Peggy Cellier;Thierry Charnois;Marc Plantevit
Affiliations:
Université de Caen, CNRS Université de Caen, GREYC, UMR6072, France;Université de Caen, CNRS Université de Caen, GREYC, UMR6072, France;Université de Lyon, CNRS Université de Lyon 1, LIRIS, UMR5205, France
Venue:
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2010

Abstract

We present a method to automatically detect and characterise interactions between genes in the biomedical literature. The approach combines two data mining techniques: frequent sequential patterns filtered by linguistic constraints, and recursive mining. Unlike most Natural Language Processing (NLP) approaches, it does not rely on syntactic parsing to learn and apply linguistic rules, and it requires no resource other than the training corpus to learn patterns. The process has two steps. First, frequent sequential patterns are extracted from the training corpus. Second, once validated, these patterns are applied to the application corpus to detect and characterise new interactions. An advantage of the method is that the detected interactions can be enriched with modalities and biological information. The training corpus consists of two corpora containing only sentences with gene interactions; a separate corpus of PubMed abstracts serves as the application corpus. Our evaluation shows good precision and acceptable recall for both tasks: interaction detection and interaction characterisation.
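
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simple prefix-growth search (in the spirit of PrefixSpan) over tokenised sentences, and the `GENE` placeholder, the toy corpus, the support threshold and the `satisfies_constraint` filter are all assumptions made for illustration. Recursive mining and the validation step are not covered.

```python
from collections import defaultdict


def frequent_sequential_patterns(sequences, min_support, max_length=4):
    """Return {pattern: support} for subsequences (gaps allowed) that occur
    in at least `min_support` sequences. Simple prefix-growth search."""
    results = {}

    def grow(prefix, projected):
        # projected: list of (sequence index, position just after the last match)
        if len(prefix) >= max_length:
            return
        # For each candidate item, keep its earliest continuation per sequence.
        occurrences = defaultdict(dict)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if seq_id not in occurrences[item]:
                    occurrences[item][seq_id] = pos + 1
        for item, by_seq in occurrences.items():
            if len(by_seq) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(by_seq)
                grow(pattern, list(by_seq.items()))

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


def satisfies_constraint(pattern):
    """Hypothetical linguistic constraint: the pattern must mention two genes
    and contain at least one other token linking them."""
    return pattern.count("GENE") >= 2 and any(tok != "GENE" for tok in pattern)


def matches(pattern, sentence):
    """True if `pattern` occurs in `sentence` as a subsequence (gaps allowed)."""
    tokens = iter(sentence)
    return all(tok in tokens for tok in pattern)


# Toy training corpus (assumption): gene names already replaced by "GENE".
training = [
    ["GENE", "inhibits", "GENE"],
    ["GENE", "strongly", "inhibits", "GENE"],
    ["GENE", "binds", "to", "GENE"],
    ["GENE", "inhibits", "the", "expression", "of", "GENE"],
]

# Step 1: mine frequent patterns, then keep those passing the constraint.
mined = frequent_sequential_patterns(training, min_support=3)
patterns = {p: s for p, s in mined.items() if satisfies_constraint(p)}

# Step 2: apply the retained patterns to a sentence from an application corpus.
new_sentence = ["GENE", "directly", "inhibits", "GENE"]
detected = [p for p in patterns if matches(p, new_sentence)]
print(patterns)
print(detected)
```

In this sketch the constraint is applied as a post-filter for readability. A real system would more likely push constraints into the search to prune candidates early, and the matched pattern itself (here the verb between the two genes) is what would characterise the interaction.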