N-grams have been widely investigated for a range of text processing and retrieval applications. This article examines the performance of digram and trigram term conflation techniques in the context of Arabic free-text retrieval. It reports the results of applying the N-gram approach to a corpus of thousands of distinct textual words drawn from a number of sources representing various disciplines. The results indicate that the digram method performs better than the trigram method with respect to conflation precision and conflation recall ratios. In either case, the N-gram approach does not appear to provide an efficient means of conflation, owing to the peculiarities of the Arabic infix structure, which reduce the rate of correct N-gram matching.
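
To make the infix problem concrete, the following minimal sketch compares two related Arabic word forms using character N-grams. It is illustrative only: the abstract does not state which similarity measure or padding scheme the article uses, so the Dice coefficient, the unpadded N-grams, the function names `ngrams` and `dice`, and the example word pair are all assumptions introduced here.

```python
# Hypothetical illustration; the measure (Dice) and the absence of
# padding are assumptions, not details taken from the article.

def ngrams(word, n):
    """Return the set of contiguous character n-grams of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def dice(a, b, n):
    """Dice coefficient over the distinct n-grams of two words."""
    x, y = ngrams(a, n), ngrams(b, n)
    if not x or not y:
        return 0.0  # a word shorter than n has no n-grams
    return 2 * len(x & y) / (len(x) + len(y))


# Two forms of the root k-t-b, written with escapes to avoid
# right-to-left rendering issues in the source:
KITAB = "\u0643\u062a\u0627\u0628"  # kaf, ta, alif, ba ("book")
KUTUB = "\u0643\u062a\u0628"        # kaf, ta, ba ("books")

print(dice(KITAB, KUTUB, 2))  # 0.4: only the digram kaf+ta is shared
print(dice(KITAB, KUTUB, 3))  # 0.0: the infixed alif breaks every trigram
```

In this sketch a single infixed letter leaves one shared digram but no shared trigram, which is consistent with the abstract's observation that digrams outperform trigrams while both suffer from the infix structure.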