Discovering linguistic patterns using sequence mining

  • Authors:
  • Nicolas Béchet;Peggy Cellier;Thierry Charnois;Bruno Crémilleux

  • Affiliations:
  • GREYC Université de Caen Basse-Normandie, Caen CEDEX, France;INSA Rennes/IRISA, Rennes cedex, France;GREYC Université de Caen Basse-Normandie, Caen CEDEX, France;GREYC Université de Caen Basse-Normandie, Caen CEDEX, France

  • Venue:
  • CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2012

Abstract

In this paper, we present a method based on data mining techniques to automatically discover linguistic patterns matching appositive qualifying phrases. We develop an algorithm that mines sequential patterns made of itemsets under gap and linguistic constraints. The itemsets allow several kinds of information to be associated with a single term, so the extracted linguistic patterns are more expressive than usual sequential patterns. In addition, the constraints make it possible to prune irrelevant patterns automatically. To manage the set of generated patterns, we propose a solution based on a partial ordering, so that a human user can easily validate them as relevant linguistic patterns. We illustrate the efficiency of our approach on two corpora drawn from a newspaper.
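
The notion of a sequential pattern made of itemsets with a gap constraint can be illustrated with a minimal sketch. This is not the authors' algorithm: it only checks whether a given pattern occurs in a sentence and counts its support, and it assumes that each token is represented as a set of features (word form, part-of-speech tag) and that the gap constraint bounds the number of tokens skipped between two consecutive matched tokens. The names `occurs`, `support`, `max_gap`, the tag labels, and the example sentences are all hypothetical.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]


def occurs(pattern: Sequence[Itemset], sentence: Sequence[Itemset], max_gap: int) -> bool:
    """Return True if `pattern` occurs in `sentence`.

    Each pattern itemset must be a subset of some token's itemset, the
    matched tokens must appear in order, and at most `max_gap` tokens may
    be skipped between two consecutive matched tokens (assumed definition).
    """
    def search(p_idx: int, start: int, limit: int) -> bool:
        if p_idx == len(pattern):
            return True
        for pos in range(start, min(len(sentence), limit)):
            if pattern[p_idx] <= sentence[pos]:
                # The next itemset must match within max_gap skipped tokens.
                if search(p_idx + 1, pos + 1, pos + max_gap + 2):
                    return True
        return False

    # The first itemset may match anywhere in the sentence.
    return search(0, 0, len(sentence))


def support(pattern: Sequence[Itemset], corpus: List[Sequence[Itemset]], max_gap: int) -> int:
    """Count the sentences of `corpus` in which `pattern` occurs."""
    return sum(1 for sentence in corpus if occurs(pattern, sentence, max_gap))


def token(*features: str) -> Itemset:
    """Build a token itemset from its features (hypothetical helper)."""
    return frozenset(features)


# Illustrative data: "Paul , an engineer , left" and "The dog barked".
s1 = [token("Paul", "pos=NPROP"), token(",", "pos=PUNCT"), token("an", "pos=DET"),
      token("engineer", "pos=NOUN"), token(",", "pos=PUNCT"), token("left", "pos=VERB")]
s2 = [token("The", "pos=DET"), token("dog", "pos=NOUN"), token("barked", "pos=VERB")]

# A pattern mixing a part-of-speech item, a word-form item, and another tag.
appositive = [token("pos=NPROP"), token(","), token("pos=NOUN")]

matches_with_gap = occurs(appositive, s1, max_gap=1)
matches_without_gap = occurs(appositive, s1, max_gap=0)
corpus_support = support(appositive, [s1, s2], max_gap=1)
```

In this toy setting the pattern matches the first sentence only when one token may be skipped (the determiner between the comma and the noun), which shows how a gap constraint restricts the set of admissible occurrences.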