A New Probabilistic Model of Text Classification and Retrieval

Authors:
T. Kalt
Affiliations:
-
Venue:
A New Probabilistic Model of Text Classification and Retrieval
Year:
1998

Citing 0
Cited 9

Text categorization for multi-page documents: a hybrid naive Bayes HMM approach
Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries

Hidden Markov Models for Text Categorization in Multi-Page Documents
Journal of Intelligent Information Systems

ACIRD: Intelligent Internet Document Organization and Retrieval
IEEE Transactions on Knowledge and Data Engineering

Empirical development of an exponential probabilistic model for text retrieval: using textual analysis to build a better model
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval

Dominant meanings classification model for web information
Design and application of hybrid intelligent systems

A risk minimization framework for information retrieval
Information Processing and Management: an International Journal - Special issue: Formal methods for information retrieval

Using hypothesis margin to boost centroid text classifier
Proceedings of the 2007 ACM symposium on Applied computing

UVA: language modeling techniques for web people search
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations

A novel neighborhood based document smoothing model for information retrieval
Information Retrieval

Abstract

This paper introduces the multinomial model of text classification and retrieval. One important feature of the model is that the tf statistic, which usually appears in probabilistic IR models as a heuristic, is an integral part of the model. Another is that the variable length of documents is accounted for, without either making a uniform length assumption or using length normalization. The multinomial model employs independence assumptions which are similar to assumptions made in previous probabilistic models, particularly the binary independence model and the 2-Poisson model. The use of simulation to study the model is described. Performance of the model is evaluated on the TREC-3 routing task. Results are compared with the binary independence model and with the simulation studies.
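The role of tf in a multinomial model can be illustrated with a small sketch. It is not the paper's formulation: it assumes a generic multinomial over a fixed vocabulary, with term probabilities estimated separately from relevant and non-relevant training documents, and it uses add-one (Laplace) smoothing, which is an assumption made here for illustration. The function names and the toy data are hypothetical. Because the multinomial coefficient depends only on the document's term counts, it cancels in the log-odds, leaving a sum in which each term's weight is multiplied by its tf. Document length enters only through those counts, so no separate length normalization step appears.

```python
import math
from collections import Counter


def estimate_term_probs(docs, vocabulary, alpha=1.0):
    """Estimate multinomial term probabilities from tokenized documents.

    alpha is an additive smoothing constant (an assumption of this sketch,
    not something taken from the paper).
    """
    counts = Counter()
    for doc in docs:
        counts.update(t for t in doc if t in vocabulary)
    total = sum(counts.values()) + alpha * len(vocabulary)
    return {t: (counts[t] + alpha) / total for t in vocabulary}


def log_odds_score(doc, p_rel, p_non):
    """Log-odds of relevant vs. non-relevant under two multinomials.

    The multinomial coefficient is identical in both classes and cancels,
    so the score reduces to sum over terms of tf * log(p_rel / p_non).
    """
    tf = Counter(t for t in doc if t in p_rel)
    return sum(n * math.log(p_rel[t] / p_non[t]) for t, n in tf.items())


# Hypothetical toy data: one relevant and one non-relevant training document.
vocabulary = {"cat", "dog", "fish"}
p_rel = estimate_term_probs([["cat", "cat", "dog"]], vocabulary)
p_non = estimate_term_probs([["fish", "fish", "dog"]], vocabulary)

score_one = log_odds_score(["cat"], p_rel, p_non)
score_two = log_odds_score(["cat", "cat"], p_rel, p_non)
score_neutral = log_odds_score(["dog"], p_rel, p_non)
```

In this toy setup a second occurrence of "cat" doubles the score, which is the sense in which tf is part of the model rather than an external weighting heuristic; a term that is equally likely in both classes contributes nothing.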