In this paper, we present a method to automatically detect and characterise interactions between genes in biomedical literature. Our approach is based on a combination of data mining techniques: frequent sequential patterns filtered by linguistic constraints, and recursive mining. Unlike most Natural Language Processing (NLP) approaches, it does not use syntactic parsing to learn and apply linguistic rules, and it requires no resource other than the training corpus to learn patterns. The process has two steps. First, frequent sequential patterns are extracted from the training corpus. Second, once validated, those patterns are applied to the application corpus to detect and characterise new interactions. An advantage of our method is that interactions can be enriched with modalities and biological information. We use two corpora containing only sentences with gene interactions as the training corpus, and a separate corpus of PubMed abstracts as the application corpus. Our evaluation shows that precision is good and recall is acceptable for both tasks: interaction detection and interaction characterisation.
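The two-step process described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sentences are already tokenised with gene names replaced by a hypothetical `GENE` placeholder, uses a simplified PrefixSpan-style miner in place of the paper's exact mining procedure, and stands in for the linguistic constraints with a toy rule (at least two `GENE` tokens plus one other token). The toy sentences, the support threshold, and all function names are assumptions made for illustration; recursive mining and manual validation are omitted.

```python
from collections import defaultdict


def mine_frequent_sequences(sequences, min_support, max_len=5):
    """Return {pattern: support} for subsequences (gaps allowed) that occur
    in at least `min_support` sequences. Simplified PrefixSpan-style search."""
    results = {}

    def grow(prefix, projected):
        # projected: list of (sequence index, position to resume scanning from)
        if len(prefix) >= max_len:
            return
        occurrences = defaultdict(dict)
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                # Record only the leftmost occurrence of each item per sequence.
                if sid not in occurrences[seq[pos]]:
                    occurrences[seq[pos]][sid] = pos + 1
        for item, proj in occurrences.items():
            if len(proj) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(proj)
                grow(pattern, list(proj.items()))

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


def satisfies_constraint(pattern):
    """Toy stand-in for a linguistic constraint (an assumption, not the
    paper's rule): two gene placeholders and at least one other token."""
    return pattern.count("GENE") >= 2 and any(tok != "GENE" for tok in pattern)


def matches(pattern, tokens):
    """True if `pattern` occurs in `tokens` as an ordered subsequence."""
    remaining = iter(tokens)
    return all(tok in remaining for tok in pattern)


# Step 1: extract and filter patterns from a toy training corpus.
training = [
    "GENE activates GENE".split(),
    "GENE strongly activates GENE".split(),
    "GENE inhibits GENE".split(),
    "GENE inhibits expression of GENE".split(),
]
frequent = mine_frequent_sequences(training, min_support=2)
patterns = {p for p in frequent if satisfies_constraint(p)}

# Step 2: apply the retained patterns to unseen sentences.
hit = "GENE directly inhibits GENE in vitro".split()
miss = "GENE is expressed in liver".split()
hit_detected = any(matches(p, hit) for p in patterns)
miss_detected = any(matches(p, miss) for p in patterns)
```

In this sketch the token between the two placeholders (for example `activates` or `inhibits`) is what would characterise the interaction once a pattern matches; a realistic setup would need a much larger corpus and a tuned support threshold.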