Toward a totally unsupervised, language-independent method for the syllabification of written texts

Authors: Thomas Mayer
Affiliations: University of Konstanz, Germany
Venue: SIGMORPHON '10 Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology
Year: 2010

Abstract

Unsupervised algorithms for the induction of linguistic knowledge should ideally require as few basic assumptions as possible and, in principle, yield good results for any language. In practice, however, such algorithms are usually tested on only a few (closely related) languages. This paper presents an approach that draws on typological knowledge to induce syllabic divisions fully automatically from reasonably sized written texts. The approach accounts for syllable structures in languages where other approaches would fail, which raises the question of whether computational methods can really be claimed to be language-universal when they have not been tested on the variety of structures found in the world's languages.