Modeling and learning multilingual inflectional morphology in a minimally supervised framework
MPL '02 Proceedings of the ACL-02 workshop on Morphological and phonological learning - Volume 6
Hunmorph: open source word analysis
Software '05 Proceedings of the Workshop on Software
Overview of Morpho challenge 2008
CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Simulating morphological analyzers with stochastic taggers for confidence estimation
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
Accurate unsupervised joint named-entity extraction from unaligned parallel text
NEWS '12 Proceedings of the 4th Named Entity Workshop
SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
In biological sequence processing, Multiple Sequence Alignment (MSA) techniques capture information about long-distance dependencies and the three-dimensional structure of protein and nucleotide sequences without resorting to polynomial-complexity context-free models. But MSA techniques have rarely been used in natural language (NL) processing, and never for NL morphology induction. Our MetaMorph algorithm is a first attempt at leveraging MSA techniques to induce NL morphology in an unsupervised fashion. Given a text corpus in any language, MetaMorph sequentially aligns the words of the corpus to form an MSA and then segments the MSA to produce morphological analyses. Over corpora that contain millions of unique word types, MetaMorph identifies morphemes at an F1 below state-of-the-art performance. But when restricted to smaller sets of orthographically related words, MetaMorph outperforms the state-of-the-art ParaMor-Morfessor Union morphology induction system. Tested on 5,000 orthographically similar Hungarian word types, MetaMorph reaches 54.1%, while ParaMor-Morfessor reaches just 41.9%. Hence, we conclude that MSA is a promising technique for unsupervised morphology induction. Future research directions are discussed.
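
The two-stage idea in the abstract (align words one at a time into an MSA, then segment the MSA) can be illustrated with a toy sketch. This is not the MetaMorph implementation: the scoring scheme, the gap penalty, the "stable column" segmentation heuristic, and the names `align_to_profile`, `build_msa`, and `segment` are all assumptions made for illustration only.

```python
from collections import Counter

GAP = "-"


def align_to_profile(rows, word, gap_penalty=-1.0):
    """Globally align `word` to an existing alignment `rows` (equal-length strings).

    Assumed scoring: a letter scores between -1 and +1 against a column,
    depending on the share of rows that carry the same letter there.
    """
    if not rows:
        return [word]
    n, m = len(rows[0]), len(word)
    cols = [Counter(r[j] for r in rows) for j in range(n)]

    def match(j, ch):
        return 2.0 * cols[j][ch] / len(rows) - 1.0

    # Needleman-Wunsch style dynamic programme over (profile column, word letter).
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + match(i - 1, word[j - 1]),
                score[i - 1][j] + gap_penalty,
                score[i][j - 1] + gap_penalty,
            )

    # Trace back: M = letter placed in an existing column,
    # D = gap in the new word, I = new column opened for a letter.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + match(i - 1, word[j - 1]):
            ops.append("M")
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap_penalty):
            ops.append("D")
            i -= 1
        else:
            ops.append("I")
            j -= 1
    ops.reverse()

    # Rebuild every old row and the new row according to the operations.
    old = [[] for _ in rows]
    new = []
    pi = pw = 0
    for op in ops:
        for k, r in enumerate(rows):
            old[k].append(GAP if op == "I" else r[pi])
        new.append(GAP if op == "D" else word[pw])
        if op != "I":
            pi += 1
        if op != "D":
            pw += 1
    return ["".join(r) for r in old] + ["".join(new)]


def build_msa(words):
    """Sequentially align each word to the alignment built so far."""
    rows = []
    for w in words:
        rows = align_to_profile(rows, w)
    return rows


def segment(rows):
    """Assumed heuristic: cut wherever columns switch between stable and variable."""
    ncols = len(rows[0])
    # A column is stable when every row holds the same non-gap symbol in it.
    stable = [
        len({r[j] for r in rows}) == 1 and rows[0][j] != GAP for j in range(ncols)
    ]
    cuts = [0] + [j for j in range(1, ncols) if stable[j] != stable[j - 1]] + [ncols]
    analyses = []
    for r in rows:
        pieces = [r[a:b].replace(GAP, "") for a, b in zip(cuts, cuts[1:])]
        analyses.append([p for p in pieces if p])
    return analyses


if __name__ == "__main__":
    msa = build_msa(["walk", "walks", "walked"])
    for row, analysis in zip(msa, segment(msa)):
        print(row, analysis)
```

The order in which words are added affects the resulting alignment, as in any progressive alignment scheme, so a real system would need a principled ordering and a segmentation criterion more robust than the all-or-nothing stable column test used here.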