The paper reports on a new approach to the automatic generation of back-of-book indexes for Chinese books. Full sentential parsing is avoided because it is inefficient and because no Chinese grammar with sufficient coverage is available. Instead, word segmentation, a fundamental analysis particular to Chinese text, is performed to break the character stream into a sequence of lexical units equivalent to words in English. The sequence of words then goes through part-of-speech tagging and noun phrase analysis. All of these analyses are carried out with a corpus-based statistical algorithm. Experiments have shown satisfactory results.
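To make the word segmentation step concrete, the following is a minimal sketch of one common corpus-based approach: choosing the segmentation that maximizes the product of unigram word probabilities, found by dynamic programming. It is not taken from the paper, which does not specify its algorithm here. The lexicon, its probabilities, the unknown-character penalty, and the names `TOY_LEXICON`, `UNKNOWN_LOGP`, `MAX_WORD_LEN`, and `segment` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy lexicon with made-up relative frequencies (not from the paper).
# In practice these would be estimated from a segmented corpus.
TOY_LEXICON = {
    "中文": 0.05,
    "书": 0.03,
    "索引": 0.02,
    "中": 0.01,
    "文": 0.01,
    "索": 0.001,
    "引": 0.001,
}
UNKNOWN_LOGP = math.log(1e-8)  # assumed penalty for a single unseen character
MAX_WORD_LEN = 4               # assumed upper bound on word length in characters


def segment(text, lexicon=TOY_LEXICON):
    """Return the segmentation with the highest unigram log-probability."""
    n = len(text)
    best = [float("-inf")] * (n + 1)  # best[i]: best log-prob of text[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in lexicon:
                logp = math.log(lexicon[word])
            elif len(word) == 1:
                # Unseen single characters are allowed so every input is segmentable.
                logp = UNKNOWN_LOGP
            else:
                continue
            score = best[start] + logp
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]
```

Under these assumed probabilities, `segment("中文书索引")` prefers the two-character words over their single-character splits. The resulting word sequence is the kind of input that the part-of-speech tagging and noun phrase analysis stages would then consume.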