We present a joint model for Chinese word segmentation and new word detection. For the joint modeling, we introduce high-dimensional new features, including word-based features and enriched edge (label-transition) features. Training a word segmentation system on large-scale datasets is already costly, and adding high-dimensional features slows training further. To solve this problem, we propose a new training method, adaptive online gradient descent based on feature frequency information, which trains the parameters very quickly online, even on large-scale datasets with high-dimensional features. Compared with existing training methods, our method is an order of magnitude faster in terms of training time and achieves equal or even higher accuracy. The proposed fast training method is a general-purpose optimization method and is not limited to the specific task discussed in this paper.
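The abstract does not state the update rule, so the following is only a minimal sketch of what a frequency-adaptive online gradient step might look like, not the method of the paper. It assumes a binary logistic model over sparse features and a hypothetical per-feature schedule `base_rate / (1 + count)`, where `count` is the number of times that feature has already been updated. The function name, the schedule, and the toy data are all illustrative assumptions.

```python
import math
from collections import defaultdict


def train_frequency_adaptive_sgd(examples, epochs=5, base_rate=0.5):
    """Online gradient training with one learning rate per feature.

    examples: list of (features, label) pairs, where features maps a
    feature name to its value and label is 0 or 1.
    ASSUMPTION: the schedule base_rate / (1 + count) is illustrative only;
    the abstract does not specify how feature frequency sets the rate.
    """
    weights = defaultdict(float)
    update_counts = defaultdict(int)  # times each feature has been updated

    for _ in range(epochs):
        for features, label in examples:
            score = sum(weights[name] * value for name, value in features.items())
            prob = 1.0 / (1.0 + math.exp(-score))
            error = label - prob  # gradient of the log-likelihood w.r.t. score

            for name, value in features.items():
                # Frequently updated features take smaller steps;
                # rarely seen features keep a larger rate.
                rate = base_rate / (1.0 + update_counts[name])
                weights[name] += rate * error * value
                update_counts[name] += 1

    return weights, update_counts


# Toy data: "bias" fires on every example, "a" and "b" on one each.
toy_examples = [
    ({"bias": 1.0, "a": 1.0}, 1),
    ({"bias": 1.0, "b": 1.0}, 0),
]
weights, update_counts = train_frequency_adaptive_sgd(toy_examples)
```

In this sketch the always-active `bias` feature accumulates updates twice as fast as `a` or `b`, so its learning rate shrinks sooner. This mirrors the general idea of tying the step size to feature frequency, under the stated assumptions.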