Practical algorithms for pattern based linear regression

Authors:
Hideo Bannai;Kohei Hatano;Shunsuke Inenaga;Masayuki Takeda
Affiliations:
Department of Informatics, Kyushu University, Fukuoka, Japan;Department of Informatics, Kyushu University, Fukuoka, Japan;Department of Informatics, Kyushu University, Fukuoka, Japan;Department of Informatics, Kyushu University, Fukuoka, Japan
Venue:
DS'05 Proceedings of the 8th international conference on Discovery Science
Year:
2005

Abstract

We consider the problem of discovering the optimal pattern from a set of strings and associated numeric attribute values. The goodness of a pattern is measured by the correlation between the number of occurrences of the pattern in each string and the numeric attribute value assigned to that string. We present two algorithms based on suffix trees that find the optimal substring pattern in O(Nn) and O(N^2) time, respectively, where n is the number of strings and N is their total length. We further present a general branch and bound strategy that can be used when considering more complex pattern classes. We also show that combining the O(N^2) algorithm with the branch and bound heuristic increases the efficiency of the algorithm considerably.
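
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the suffix-tree algorithms from the paper: it simply enumerates every substring of the input, counts its occurrences in each string, and scores it against the numeric values. The scoring choice (absolute Pearson correlation), the counting of overlapping occurrences, and the names `count_occurrences`, `pearson`, and `best_substring_pattern` are assumptions made for illustration, since the abstract does not fix these details.

```python
from math import sqrt


def count_occurrences(pattern, text):
    """Count occurrences of pattern in text (overlaps allowed; an assumption)."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_substring_pattern(strings, values):
    """Naive search over all substrings; far slower than the paper's bounds."""
    # Every non-empty substring of every input string is a candidate pattern.
    candidates = {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }
    best_pattern, best_score = None, -1.0
    # Sorted order makes tie-breaking deterministic (first best candidate wins).
    for pattern in sorted(candidates):
        counts = [count_occurrences(pattern, s) for s in strings]
        score = abs(pearson(counts, values))  # assumed goodness measure
        if score > best_score:
            best_pattern, best_score = pattern, score
    return best_pattern, best_score


if __name__ == "__main__":
    # Toy data: each value equals the number of 'a' characters in its string.
    strings = ["aaa", "ab", "bb", "aab"]
    values = [3.0, 1.0, 0.0, 2.0]
    print(best_substring_pattern(strings, values))
```

In the toy data, the occurrence counts of the pattern "a" match the attribute values exactly, so that pattern gets the highest score. The sketch is only meant to show what "optimal pattern" means here; the suffix-tree algorithms in the paper exist to avoid this exhaustive enumeration.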