Data nonlinearity in exploratory multivariate analysis of language corpora

  • Authors: Hermann Moisl
  • Affiliations: University of Newcastle, Newcastle upon Tyne, United Kingdom
  • Venue: SigMorPhon '07 Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology
  • Year: 2007

Abstract

Data nonlinearity has not historically been, and is not currently, an issue in work on exploratory multivariate analysis of language corpora. However, the presence of nonlinearity in data has a fundamental bearing on the conduct of exploratory analysis. The first part of the discussion explains why this is so in principle, and the second exemplifies the explanation via exploratory analysis of the Newcastle Electronic Corpus of Tyneside English (NECTE), an historical speech corpus. The conclusion is that data should be screened for nonlinearity prior to analysis and, if a substantial degree of it is found, a nonlinear analytical method should be used.