This study reports the results of using minimum description length (MDL) analysis to model unsupervised learning of the morphological segmentation of European languages, using corpora ranging in size from 5,000 words to 500,000 words. We develop a set of heuristics that rapidly build a probabilistic morphological grammar, and we use MDL as our primary tool to decide whether the modifications proposed by the heuristics are adopted. The resulting grammar agrees well with the analysis that a human morphologist would develop. In the final section, we discuss the relationship of this style of MDL grammatical analysis to the notion of evaluation metric in early generative grammar.
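The role MDL plays as an accept/reject test can be illustrated with a minimal sketch. It is not the study's actual formulation: the flat per-letter cost, the independent stem and suffix probabilities, the toy word list, and the names `description_length` and `accept` are all assumptions made for illustration. A proposed analysis is adopted only if it shortens the total description length, that is, the cost of the morpheme lists plus the cost of encoding the corpus with them.

```python
import math
from collections import Counter

# Assumption: every letter costs the same number of bits (26-letter alphabet).
BITS_PER_LETTER = math.log2(26)


def description_length(analysis):
    """Total bits for a list of (stem, suffix) pairs, one pair per word.

    Model cost: spell out each distinct stem and suffix once
    (plus one end-of-morpheme symbol each).
    Data cost: encode each word as a stem choice and a suffix choice,
    using relative frequencies as (assumed independent) probabilities.
    """
    stems = Counter(stem for stem, _ in analysis)
    suffixes = Counter(suffix for _, suffix in analysis)
    n = len(analysis)

    model_bits = sum(
        (len(morph) + 1) * BITS_PER_LETTER
        for morph in list(stems) + list(suffixes)
    )
    data_bits = sum(
        -math.log2(stems[stem] / n) - math.log2(suffixes[suffix] / n)
        for stem, suffix in analysis
    )
    return model_bits + data_bits


def accept(current, proposed):
    """Adopt the proposed analysis only if it shortens the description."""
    return description_length(proposed) < description_length(current)


# Hypothetical toy corpus: three stems, each with four suffixes.
words = [s + f for s in ("jump", "walk", "talk") for f in ("", "s", "ed", "ing")]

# Baseline: every word is an unanalysed stem with an empty suffix.
unsegmented = [(w, "") for w in words]

# Proposal from some heuristic: split off the shared suffixes.
segmented = [(s, f) for s in ("jump", "walk", "talk") for f in ("", "s", "ed", "ing")]

print(accept(unsegmented, segmented))
```

On this toy corpus the segmented analysis is adopted, because listing three stems and four suffixes is much cheaper than listing twelve whole words, while the cost of encoding the corpus itself stays the same.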