When federating knowledge across text documents, one has to determine whether or not documents are related by context. We propose a new approach to this problem, which leads to the design of a new search engine for literature research and related tasks. The idea is that the user already has some documents of interest. These documents are taken as input, and all documents known to a classical search engine are then ranked according to their relevance to the input documents. To achieve this goal we use Markov chains of variable length. The algorithms developed have been implemented and tested on the Reuters-21578 data set.
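The ranking idea can be illustrated with a minimal sketch. It is not the implementation evaluated on Reuters-21578. It rests on several assumptions that the abstract does not state: symbols are single characters, the context length is chosen by backing off to the longest suffix seen at least `min_count` times, probabilities use add-one smoothing, and relevance is the average log-likelihood per symbol. The names `VariableOrderMarkovModel`, `rank`, `max_order`, and `min_count` are hypothetical.

```python
import math
from collections import defaultdict


class VariableOrderMarkovModel:
    """Illustrative variable-length Markov chain over characters (assumed design)."""

    def __init__(self, max_order=3, min_count=2):
        self.max_order = max_order    # longest context length considered (assumption)
        self.min_count = min_count    # observations needed to trust a context (assumption)
        self.counts = defaultdict(lambda: defaultdict(int))  # context -> next symbol -> count
        self.alphabet = set()

    def fit(self, documents):
        """Count next-symbol frequencies for every context of length 0..max_order."""
        for text in documents:
            self.alphabet.update(text)
            for i, symbol in enumerate(text):
                for k in range(min(self.max_order, i) + 1):
                    self.counts[text[i - k:i]][symbol] += 1
        return self

    def _log_prob(self, history, symbol):
        """Log-probability of symbol given the longest sufficiently observed suffix."""
        vocab = len(self.alphabet) + 1  # one extra slot for unseen symbols
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - k:]
            followers = self.counts.get(context, {})
            total = sum(followers.values())
            # The empty context (k == 0) is always accepted as the last resort.
            if total >= self.min_count or k == 0:
                return math.log((followers.get(symbol, 0) + 1) / (total + vocab))

    def score(self, text):
        """Average log-likelihood per symbol; higher means closer to the seed documents."""
        if not text:
            return float("-inf")
        total = 0.0
        for i, symbol in enumerate(text):
            total += self._log_prob(text[max(0, i - self.max_order):i], symbol)
        return total / len(text)


def rank(model, candidates):
    """Sort candidate documents by decreasing relevance score."""
    return sorted(candidates, key=model.score, reverse=True)


# Documents of interest (toy data invented for this sketch).
seeds = ["markov chains rank documents", "variable length markov chains"]
# Stand-in for the documents returned by a classical search engine.
candidates = ["quarterly profit jumped", "markov chains for documents"]

model = VariableOrderMarkovModel(max_order=3, min_count=2).fit(seeds)
ranking = rank(model, candidates)
```

Here `ranking` lists the candidates from most to least similar to the seed documents. Normalising by document length keeps long documents from being penalised merely for containing more symbols; whether the original system normalises in this way is not stated in the abstract.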