We extend previous work on fully unsupervised part-of-speech (PoS) tagging. Using a non-parametric version of the HMM, the infinite HMM (iHMM), we address the problem of choosing the number of hidden states in unsupervised Markov models for PoS tagging. We experiment with two non-parametric priors, the Dirichlet process and the Pitman-Yor process, on the Wall Street Journal dataset, using a parallelized implementation of an iHMM inference algorithm. We evaluate the results with a variety of clustering evaluation metrics and achieve performance equal to or better than previously reported results. Building on this promising result, we use the output of the unsupervised PoS tagger as a direct replacement for the output of a fully supervised PoS tagger in the task of shallow parsing, and compare the two evaluations.
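The abstract does not name the clustering evaluation metrics used. As an illustration only, the sketch below computes two measures that are commonly applied to induced PoS tags: many-to-one accuracy and variation of information. The function names and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def many_to_one_accuracy(gold, induced):
    """Map each induced cluster to its most frequent gold tag,
    then return the resulting token-level accuracy."""
    if len(gold) != len(induced) or not gold:
        raise ValueError("gold and induced must be non-empty and equally long")
    joint = Counter(zip(induced, gold))
    best = {}  # induced cluster -> count of its most frequent gold tag
    for (cluster, _tag), count in joint.items():
        if count > best.get(cluster, 0):
            best[cluster] = count
    return sum(best.values()) / len(gold)


def variation_of_information(gold, induced):
    """VI = H(gold | induced) + H(induced | gold), in nats. Lower is better."""
    if len(gold) != len(induced) or not gold:
        raise ValueError("gold and induced must be non-empty and equally long")
    n = len(gold)
    joint = Counter(zip(gold, induced))
    gold_counts = Counter(gold)
    induced_counts = Counter(induced)
    vi = 0.0
    for (tag, cluster), count in joint.items():
        # count / induced_counts[cluster] = p(tag | cluster)
        # count / gold_counts[tag]        = p(cluster | tag)
        vi -= (count / n) * (
            math.log(count / induced_counts[cluster])
            + math.log(count / gold_counts[tag])
        )
    return vi


# Hypothetical toy data: five tokens, gold tags versus induced state ids.
gold_tags = ["DT", "NN", "VB", "DT", "NN"]
induced_states = [0, 1, 2, 0, 1]
m2o = many_to_one_accuracy(gold_tags, induced_states)
vi = variation_of_information(gold_tags, induced_states)
```

Many-to-one accuracy tends to rise as the number of induced states grows, which matters when the model chooses that number itself, as the iHMM does. Variation of information penalizes both over-splitting and over-merging, so reporting the two together gives a more balanced picture.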