Discovering Characteristic Patterns from Collections of Classical Japanese Poems

Authors:
Mayumi Yamasaki;Masayuki Takeda;Tomoko Fukuda;Ichiro Nanri
Venue:
DS '98 Proceedings of the First International Conference on Discovery Science
Year:
1998

Citing 4

Mining in the Phrasal Frontier
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery

Discovering Unbounded Unions of Regular Pattern Languages from Positive Examples (Extended Abstract)
ISAAC '96 Proceedings of the 7th International Symposium on Algorithms and Computation

Finding Minimal Generalizations for Unions of Pattern Languages and Its Application to Inductive Inference from Positive Data
STACS '94 Proceedings of the 11th Annual Symposium on Theoretical Aspects of Computer Science

Finding patterns common to a set of strings (Extended Abstract)
STOC '79 Proceedings of the eleventh annual ACM symposium on Theory of computing

Cited 3

Mining from Literary Texts: Pattern Discovery and Similarity Computation
Progress in Discovery Science, Final Report of the Japanese Discovery Science Project

Discovering Poetic Allusion in Anthologies of Classical Japanese Poems
DS '99 Proceedings of the Second International Conference on Discovery Science

Discovering Characteristic Expressions from Literary Works: A New Text Analysis Method beyond N-Gram Statistics and KWIC
DS '00 Proceedings of the Third International Conference on Discovery Science

Abstract

WAKA is a form of traditional Japanese poetry with a 1300-year history. In this paper, we attempt to discover characteristics common to a collection of WAKA poems. As a formalism for characteristics, we use regular patterns whose constant parts are limited to sequences of auxiliary verbs and postpositional particles. We call such patterns FUSHI. The problem is to automatically find significant FUSHI patterns that characterize the poems. Solving this problem requires a reliable significance measure for the patterns. Brāzma et al. (1996) proposed such a measure based on the MDL principle. Using this method, we report successful results in finding patterns from five anthologies. Some of the results are quite stimulating, and we hope that they will lead to new discoveries. Based on our experience, we also propose a pattern-based text data mining system. Further research into WAKA poetry is now proceeding using this system.
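
To make the notion of a regular pattern concrete, the following is a minimal sketch, not the authors' implementation. It assumes a pattern is given as an ordered list of constant strings separated by variables, and that each variable may match any string, including the empty one. The function names and the toy strings are hypothetical placeholders invented for illustration; they are not real poems. The MDL-based significance measure is not reproduced here; the sketch only counts how many texts a pattern matches.

```python
import re


def pattern_to_regex(constants):
    """Compile a pattern of the form *c1*c2*...*cn* into a regex.

    Assumption of this sketch: each '*' is a variable that matches any
    string, possibly empty. The constants must occur in the given order.
    """
    body = ".*".join(re.escape(c) for c in constants)
    return re.compile(".*" + body + ".*", re.DOTALL)


def matches(constants, text):
    """Return True if the text is an instance of the pattern."""
    return pattern_to_regex(constants).fullmatch(text) is not None


def coverage(constants, texts):
    """Count how many texts in the collection the pattern matches."""
    return sum(1 for t in texts if matches(constants, t))


# Hypothetical toy strings (placeholders, not real poems). The constants
# stand in for particle and auxiliary-verb sequences.
toy_texts = [
    "aa no bb wa cc keri",
    "dd no ee keri",
    "ff wa gg no hh",
]

if __name__ == "__main__":
    for pattern in ([" no ", " keri"], [" no ", " wa ", " keri"]):
        print(pattern, coverage(pattern, toy_texts))
```

A significance measure would then rank candidate patterns; plain coverage is used here only because it needs no further assumptions.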