In this paper, we present a method based on Hidden Markov Models (HMMs) to generate statistical stemmers. Using a list of words as a training set, the method estimates the HMM parameters, which are then used to compute the most probable stem for an arbitrary word. Stemming is performed by computing the most probable path through the HMM states for the input word. Neither linguistic knowledge nor a training set of manually stemmed words is required. We describe the method and report the results of experiments carried out on standard test collections for five different languages.
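
To make the decoding step concrete, the following is a minimal Python sketch of finding the most probable state path for a word with the Viterbi algorithm and reading a stem off that path. It is an illustration under stated assumptions, not the method evaluated in the paper: the two-state topology (STEM, SUFFIX), the rule that the stem ends where the path first enters SUFFIX, and every probability value are hypothetical and set by hand here, whereas the abstract describes parameters estimated from a word list.

```python
import math

# Hypothetical two-state topology: a word starts in STEM and may move
# once into SUFFIX. All numbers below are hand-set for illustration,
# not estimated from a training word list.
STATES = ("STEM", "SUFFIX")
START = {"STEM": 1.0, "SUFFIX": 0.0}
TRANS = {
    "STEM": {"STEM": 0.7, "SUFFIX": 0.3},
    "SUFFIX": {"STEM": 0.0, "SUFFIX": 1.0},
}
SUFFIX_CHARS = "deings"  # letters assumed to be typical of suffixes


def emit(state, ch):
    """Toy emission probability of a lowercase letter in a state."""
    if state == "STEM":
        return 1.0 / 26  # uniform over a-z
    # SUFFIX: 6 favoured letters share 0.9, the other 20 share 0.1.
    return 0.15 if ch in SUFFIX_CHARS else 0.1 / 20


def log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(word):
    """Return the most probable state sequence for a non-empty word."""
    best = {s: log(START[s]) + log(emit(s, word[0])) for s in STATES}
    back = []  # back[i][s] = best predecessor of s at position i + 1
    for ch in word[1:]:
        prev, best, pointers = best, {}, {}
        for s in STATES:
            p = max(STATES, key=lambda r: prev[r] + log(TRANS[r][s]))
            best[s] = prev[p] + log(TRANS[p][s]) + log(emit(s, ch))
            pointers[s] = p
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=best.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def stem(word):
    """Cut the word where the decoded path first enters SUFFIX."""
    if not word:
        return word
    path = viterbi(word)
    cut = path.index("SUFFIX") if "SUFFIX" in path else len(word)
    return word[:cut]


if __name__ == "__main__":
    for w in ("playing", "walked", "cat"):
        print(w, "->", stem(w))
```

With these toy parameters the sketch cuts "playing" to "play" and "walked" to "walk", and leaves "cat" unchanged. Because the emission table is a crude stand-in, short words made mostly of suffix-like letters are over-stemmed, which is the kind of behaviour that estimating the parameters from data is meant to improve.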