This paper presents a trainable rule-based algorithm for word segmentation. The algorithm provides a simple, language-independent alternative to large-scale lexicon-based segmenters, which require large amounts of knowledge engineering. We show that, used as a stand-alone segmenter, the algorithm achieves high-performance Chinese segmentation. In addition, we show that the transformation-based algorithm is effective in improving the output of several existing word segmentation algorithms in three different languages.
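To make the idea of a trainable, transformation-based segmenter concrete, here is a minimal sketch of greedy error-driven rule learning over word boundaries. The abstract does not give the rule templates or the training procedure, so everything in it is an assumption made for illustration: the two-character "split"/"join" rule form, the function names `boundaries`, `apply_rule`, `learn` and `segment`, and the toy data.

```python
from typing import List, Set, Tuple

# Hypothetical rule form: (action, left_char, right_char).
# "split" inserts a boundary between left_char and right_char; "join" removes it.
Rule = Tuple[str, str, str]
Example = Tuple[str, Set[int], Set[int]]  # (text, initial boundaries, gold boundaries)


def boundaries(words: List[str]) -> Set[int]:
    """Internal boundary offsets of a segmented sentence."""
    out, pos = set(), 0
    for w in words[:-1]:
        pos += len(w)
        out.add(pos)
    return out


def apply_rule(text: str, bounds: Set[int], rule: Rule) -> Set[int]:
    """Apply one rule at every position whose two-character context matches."""
    action, left, right = rule
    new = set(bounds)
    for i in range(1, len(text)):
        if text[i - 1] == left and text[i] == right:
            if action == "split":
                new.add(i)
            else:
                new.discard(i)
    return new


def learn(corpus: List[Example], max_rules: int = 10) -> List[Rule]:
    """Greedily pick the rule that removes the most boundary errors, then repeat."""
    state = [(t, set(b), g) for t, b, g in corpus]
    rules: List[Rule] = []
    for _ in range(max_rules):
        # Propose candidate rules only at positions that are currently wrong.
        candidates = set()
        for text, b, g in state:
            candidates |= {("split", text[i - 1], text[i]) for i in g - b}
            candidates |= {("join", text[i - 1], text[i]) for i in b - g}
        best, best_gain = None, 0
        for rule in sorted(candidates):
            # Gain = errors before minus errors after, summed over the corpus.
            gain = sum(len(b ^ g) - len(apply_rule(t, b, rule) ^ g)
                       for t, b, g in state)
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:  # no rule improves the corpus any further
            break
        rules.append(best)
        state = [(t, apply_rule(t, b, best), g) for t, b, g in state]
    return rules


def segment(text: str, initial_bounds: Set[int], rules: List[Rule]) -> List[str]:
    """Apply the learned rules in order to an initial segmentation."""
    b = set(initial_bounds)
    for r in rules:
        b = apply_rule(text, b, r)
    cuts = [0] + sorted(b) + [len(text)]
    return [text[a:c] for a, c in zip(cuts, cuts[1:])]


# Toy data: start from one character per word and learn from one gold sentence.
toy_rules = learn([("abcab", {1, 2, 3, 4}, boundaries(["ab", "c", "ab"]))])
toy_output = segment("abab", {1, 2, 3}, toy_rules)
```

Because `segment` takes an arbitrary initial segmentation, the same loop can either act as a stand-alone segmenter (starting from a trivial one-character-per-word segmentation, as in the toy data) or post-process the output of another segmenter, which mirrors the two uses described above.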