Stemming is used in many information retrieval (IR) systems to reduce variant word forms to common roots. It is one of the simplest applications of natural-language processing to IR, and one of the most effective in terms of user acceptance and consistency, although the retrieval improvements it yields are small. Current stemming techniques do not, however, reflect language use in specific corpora, and this can lead to occasional serious retrieval failures. We propose a technique that uses corpus-based co-occurrence statistics of word variants to modify or create a stemmer. Experimental results on English newspaper and legal text and on Spanish text demonstrate the viability of this technique and its advantages over conventional approaches that employ only morphological rules.
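The general idea of refining a stemmer with corpus statistics can be illustrated with a small sketch. This is not the method evaluated in the paper, only one possible reading of it under stated assumptions: `crude_stem` is a hypothetical, deliberately aggressive suffix stripper standing in for a rule-based stemmer; co-occurrence is counted at the document level; the Dice coefficient and the 0.5 threshold are illustrative choices; and the toy corpus `DOCS` is invented for the example. Word variants that the stemmer conflates are kept together only if they tend to appear in the same documents.

```python
from collections import defaultdict
from itertools import combinations


def crude_stem(word):
    """Hypothetical aggressive suffix stripper (stand-in for a rule-based stemmer)."""
    for suffix in ("ing", "es", "s", "y"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def connected_components(members, linked):
    """Group members into components, where linked(a, b) says whether a and b share an edge."""
    parent = {w: w for w in members}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for a, b in combinations(sorted(members), 2):
        if linked(a, b):
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for w in members:
        groups[find(w)].add(w)
    return list(groups.values())


def refine_classes(docs, stem, threshold=0.5):
    """Split each stem class so that only variants that co-occur stay conflated."""
    # postings[w] = set of ids of the documents that contain w
    postings = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for tok in tokens:
            postings[tok].add(doc_id)

    # Initial equivalence classes produced by the rule-based stemmer.
    classes = defaultdict(set)
    for w in postings:
        classes[stem(w)].add(w)

    def linked(a, b):
        # Dice coefficient over document-level co-occurrence (an assumed measure).
        both = len(postings[a] & postings[b])
        return 2 * both / (len(postings[a]) + len(postings[b])) >= threshold

    refined = []
    for members in classes.values():
        refined.extend(connected_components(members, linked))
    return refined


# Invented toy corpus: "stock"/"stocks" share documents, "new"/"news" never do.
DOCS = [
    ["stock", "prices", "rose", "as", "stocks", "rallied"],
    ["stocks", "fell", "while", "one", "stock", "gained"],
    ["the", "evening", "news", "covered", "the", "election"],
    ["a", "new", "policy", "was", "announced"],
    ["news", "anchors", "read", "the", "headlines"],
    ["new", "rules", "for", "new", "buildings"],
]

if __name__ == "__main__":
    for group in refine_classes(DOCS, crude_stem):
        if len(group) > 1:
            print(sorted(group))
```

On this toy corpus, the stripper alone would conflate "new" with "news" as well as "stock" with "stocks". The co-occurrence check keeps the second pair together and separates the first, which is the kind of corpus-specific correction that purely morphological rules cannot make.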