This paper proposes an integrated method for bilingual chunk partition and alignment, called “Interactional Matching”. Unlike previous work, the method tries to obtain as much of the necessary information as possible from the bilingual corpora themselves, and, through bilingual constraints, it automatically builds one-to-one chunk-pairs, each associated with a chunk-pair confidence coefficient. Unlike collocation extraction methods, it also partitions bilingual sentences entirely into chunks, leaving no fragments. Furthermore, by using Probabilistic Latent Semantic Indexing (PLSI), the method can handle not only compositional chunks but also non-compositional ones. Experiments show that, for the overall process (including partition and alignment), the method achieves 85% precision with 57% recall for written-language chunk-pairs and 78% precision with 53% recall for spoken-language chunk-pairs.
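The precision and recall figures above can be read against a small sketch of how such scores might be computed over chunk-pairs. The abstract does not say how a proposed chunk-pair is judged correct, so this sketch assumes exact matching against a gold-standard set; the function name `precision_recall`, the tuple representation of a chunk-pair, and the toy data are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: a chunk-pair is (source_chunk, target_chunk).
ChunkPair = Tuple[str, str]


def precision_recall(predicted: Set[ChunkPair],
                     gold: Set[ChunkPair]) -> Tuple[float, float]:
    """Score predicted chunk-pairs against gold pairs by exact match.

    Assumption: a predicted pair counts as correct only if the identical
    pair appears in the gold set. The paper may use a different criterion.
    """
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data with placeholder tokens, not taken from the paper.
predicted = {("a b", "x"), ("c", "y z"), ("d", "w")}
gold = {("a b", "x"), ("c", "y z"), ("d e", "w"), ("f", "v")}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three predicted pairs are in the gold set, and the gold set has four pairs, so precision is 2/3 and recall is 1/2. High precision with lower recall, as in the reported results, corresponds to proposing fewer pairs than the gold standard contains while getting most of the proposed ones right.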