Chinese word segmentation is a difficult, important, and widely studied sequence modeling problem. This paper demonstrates that linear-chain conditional random fields (CRFs) can perform robust and accurate Chinese word segmentation by providing a principled framework that easily supports the integration of domain knowledge in the form of multiple lexicons of characters and words. We also present a probabilistic new word detection method, which further improves performance. Our system is evaluated on four datasets used in a recent comprehensive Chinese word segmentation competition and obtains state-of-the-art performance.
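To make the idea of lexicon features in a linear-chain model concrete, here is a minimal sketch in Python. It is not the paper's system: the two-tag scheme (`B` for a character that begins a word, `I` for one that continues it), the feature names, the hand-set weights, and the helper functions `features`, `viterbi`, and `tags_to_words` are all assumptions made for illustration. A real CRF would learn the weights from labeled data; this sketch only shows how lexicon membership can enter as features and how the best tag sequence is decoded.

```python
from typing import Dict, List, Set, Tuple

TAGS = ("B", "I")  # assumed tag set: B = begins a word, I = continues a word


def features(text: str, i: int, lexicons: Dict[str, Set[str]]) -> List[str]:
    """Hypothetical feature templates for position i, one group per lexicon."""
    feats = ["bias"]
    for name, entries in lexicons.items():
        if text[i] in entries:
            feats.append(f"char_in:{name}")
        if i > 0 and text[i - 1:i + 1] in entries:
            feats.append(f"prev_bigram_in:{name}")
    return feats


def viterbi(
    text: str,
    lexicons: Dict[str, Set[str]],
    state_w: Dict[Tuple[str, str], float],
    trans_w: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring tag sequence under a linear-chain score."""
    if not text:
        return []

    def emit(i: int, tag: str) -> float:
        return sum(state_w.get((f, tag), 0.0) for f in features(text, i, lexicons))

    # The first character must begin a word, so "I" is ruled out at position 0.
    score = {"B": emit(0, "B"), "I": float("-inf")}
    back: List[Dict[str, str]] = []
    for i in range(1, len(text)):
        new_score, pointers = {}, {}
        for tag in TAGS:
            best_prev = max(TAGS, key=lambda p: score[p] + trans_w.get((p, tag), 0.0))
            new_score[tag] = (
                score[best_prev] + trans_w.get((best_prev, tag), 0.0) + emit(i, tag)
            )
            pointers[tag] = best_prev
        score = new_score
        back.append(pointers)

    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: score[t])
    tags = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        tags.append(tag)
    return tags[::-1]


def tags_to_words(text: str, tags: List[str]) -> List[str]:
    """Group characters into words according to their B/I tags."""
    words: List[str] = []
    for ch, tag in zip(text, tags):
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Illustrative, hand-set weights (a trained CRF would estimate these).
lexicons = {"words": {"北京"}}
state_w = {("bias", "B"): 1.0, ("prev_bigram_in:words", "I"): 2.0}
trans_w: Dict[Tuple[str, str], float] = {}

tags = viterbi("我爱北京", lexicons, state_w, trans_w)
print(tags_to_words("我爱北京", tags))  # prints ['我', '爱', '北京']
```

Adding another lexicon (for example, a set of surname characters) only requires a new entry in `lexicons` plus weights for its features, which is the sense in which such a framework accommodates multiple sources of domain knowledge.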