Chinese sentences are composed of strings of characters, with no blanks to mark word boundaries. However, the basic unit for sentence parsing and understanding is the word. Therefore, the first step in processing Chinese sentences is to identify the words. The difficulties of identifying words include (1) the identification of complex words, such as Determinative-Measure compounds, reduplications, and derived words; (2) the identification of proper names; and (3) the resolution of ambiguous segmentations. In this paper, we propose possible solutions to these difficulties. We adopt a matching algorithm with 6 different heuristic rules to resolve the ambiguities and achieve a success rate of 99.77%. The statistical data support the conclusion that maximal matching is the most effective heuristic.
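
To make the idea of maximal matching concrete, the following is a minimal sketch of greedy forward longest-match segmentation. It is not the six-rule algorithm described in the abstract, whose details are not given here. The function name `max_match`, the `max_len` parameter, and the toy lexicon are all assumptions introduced for illustration.

```python
def max_match(sentence, lexicon, max_len=4):
    """Greedy forward maximal matching (illustrative sketch only).

    At each position, take the longest substring found in the lexicon;
    if nothing matches, emit a single character as a fallback.
    """
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, down to a single character.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon, not taken from the paper.
LEXICON = {"研究", "研究生", "生命", "起源", "的"}

print(max_match("研究生命的起源", LEXICON))
# → ['研究生', '命', '的', '起源']
```

With this toy lexicon, the greedy match picks 研究生 ("graduate student") and strands 命, whereas the intended reading is 研究 / 生命 / 的 / 起源 ("study the origin of life"). Ambiguous segmentations of this kind are what additional disambiguation heuristics would have to resolve.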