Fragmentary Pattern Matching: Complexity, Algorithms and Applications for Analyzing Classic Literary Works

Authors:
Hideaki Hori; Shinichi Shimozono; Masayuki Takeda; Ayumi Shinohara
Affiliations:
Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Japan 820-8502; Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan 820-8502; Department of Informatics, Kyushu University 33, Fukuoka, Japan 812-8581 and PRESTO, Japan Science and Technology Corporation, Japan; Department of Informatics, Kyushu University 33, Fukuoka, Japan 812-8581
Venue:
ISAAC '01 Proceedings of the 12th International Symposium on Algorithms and Computation
Year:
2001

Citing 0
Cited 4

Mining from Literary Texts: Pattern Discovery and Similarity Computation
Progress in Discovery Science, Final Report of the Japanese Discovery Science Project

Discovering Repetitive Expressions and Affinities from Anthologies of Classical Japanese Poems
DS '01 Proceedings of the 4th International Conference on Discovery Science

Identifying Quotations in Reference Works and Primary Materials
ECDL '08 Proceedings of the 12th European conference on Research and Advanced Technology for Digital Libraries

On shortest common superstring and swap permutations
SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval

Abstract

A fragmentary pattern is a multiset of non-empty strings; it matches a string w if all of its strings occur in w without overlapping one another. We study fundamental computational complexity issues related to the matching of fragmentary patterns. We show that the fragmentary pattern matching problem is NP-complete, and that the problem of finding a fragmentary pattern common to two strings that maximizes the pattern score is NP-hard. Moreover, we propose a polynomial-time approximation algorithm for fragmentary pattern matching, and show that it achieves a constant worst-case approximation ratio if either the strings in a pattern all have the same length, or the importance weights of the strings in a pattern are proportional to their lengths.
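
To make the matching condition concrete, the following is a minimal brute-force sketch of the decision problem as the abstract defines it. It is not the approximation algorithm proposed in the paper; it is a plain backtracking search whose worst-case running time is exponential. The names `occurrences` and `fragmentary_match` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def occurrences(text: str, piece: str) -> List[int]:
    """Return every start index at which piece occurs in text."""
    starts = []
    pos = text.find(piece)
    while pos != -1:
        starts.append(pos)
        pos = text.find(piece, pos + 1)
    return starts


def fragmentary_match(pattern: Sequence[str], text: str) -> bool:
    """Decide whether all strings in pattern occur in text without overlaps.

    The pattern is treated as a multiset: order is irrelevant, and a string
    listed twice needs two disjoint occurrences. Illustrative brute force
    only (an assumption of this sketch, not the paper's method).
    """
    if any(not piece for piece in pattern):
        raise ValueError("pattern strings must be non-empty")

    # Heuristic: placing longer strings first tends to prune the search sooner.
    pieces = sorted(pattern, key=len, reverse=True)
    used = [False] * len(text)  # positions of text already covered

    def place(i: int) -> bool:
        if i == len(pieces):
            return True  # every string has a disjoint occurrence
        piece = pieces[i]
        for start in occurrences(text, piece):
            end = start + len(piece)
            if not any(used[start:end]):
                used[start:end] = [True] * len(piece)
                if place(i + 1):
                    return True
                used[start:end] = [False] * len(piece)  # undo and try next
        return False

    return place(0)
```

For instance, `fragmentary_match(["ab", "ba"], "abba")` returns `True`, while the same pattern against `"aba"` returns `False`, because the only occurrences of the two strings share the middle character.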