A maximum entropy approach to natural language processing
Computational Linguistics
Probabilistic latent semantic indexing
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
A technique for computer detection and correction of spelling errors
Communications of the ACM
Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I
Computational Linguistics
Automatic retrieval and clustering of similar words
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Discriminative training and maximum entropy models for statistical machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
An improved error model for noisy channel spelling correction
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Generating query substitutions
Proceedings of the 15th international conference on World Wide Web
Espresso: leveraging generic patterns for automatically harvesting semantic relations
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Exploring distributional similarity based models for query spelling correction
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
A hybrid back-transliteration system for Japanese
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Automatic construction of Japanese KATAKANA variant list from large corpus
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Learning a spelling error model from search query logs
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Managing misspelled queries in IR applications
Information Processing and Management: an International Journal
Latent class transliteration based on source language origin
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Latent semantic transliteration using dirichlet mixture
NEWS '12 Proceedings of the 4th Named Entity Workshop
Hi-index | 0.00 |
We propose a unified approach to web search query alterations in Japanese that is not limited to particular character types or orthographic similarity between a query and its alteration candidate. Our model is based on previous work on English query correction, but makes some crucial improvements: (1) we augment the query-candidate list to include orthographically dissimilar but semantically similar pairs; and (2) we use kernel-based lexical semantic similarity to avoid the problem of data sparseness in computing query-candidate similarity. We also propose an efficient method for generating query-candidate pairs for model training and testing. We show that the proposed method achieves about 80% accuracy on the query alteration task, improving over previously proposed methods that use semantic similarity.