In this paper, we propose and evaluate two alternatives for dealing with degraded queries in Spanish IR applications. The first is an n-gram-based strategy that does not depend on the amount of linguistic knowledge available. The second consists of two spelling correction techniques, one of which depends strongly on a stochastic model that must be built beforehand from a POS-tagged corpus. To study their validity, a testing framework has been formally designed and applied to both approaches.
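To illustrate why an n-gram-based strategy can tolerate degraded queries without linguistic resources, the following minimal sketch splits words into overlapping character n-grams and measures how many of them survive a misspelling. The n-gram length of 4, the function names, and the use of the Dice coefficient are assumptions made for this example only; they are not taken from the paper.

```python
def char_ngrams(word: str, n: int = 4) -> list[str]:
    """Return the overlapping character n-grams of a word.

    Words no longer than n are kept whole (an assumption of this sketch).
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_tokens(text: str, n: int = 4) -> list[str]:
    """Lowercase the text, split it on whitespace, and n-gram each word."""
    tokens: list[str] = []
    for word in text.lower().split():
        tokens.extend(char_ngrams(word, n))
    return tokens


def dice_overlap(a: str, b: str, n: int = 4) -> float:
    """Dice coefficient between the n-gram sets of two strings.

    Chosen here only as a simple illustrative similarity measure.
    """
    grams_a, grams_b = set(ngram_tokens(a, n)), set(ngram_tokens(b, n))
    if not grams_a and not grams_b:
        return 1.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    # A single substituted character still leaves part of the n-grams intact.
    print(char_ngrams("informacion"))
    print(dice_overlap("informacion", "infornacion"))
```

In this sketch a one-character error in "informacion" leaves half of the 4-grams unchanged, so a misspelled query term can still partially match the index. A word-level index, by contrast, would miss the term entirely unless it is corrected first.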