Text categorization and retrieval tasks often depend on a good representation of textual data. Departing from the classical vector space model, several probabilistic models, such as Probabilistic Latent Semantic Analysis (PLSA), have been proposed recently. In this paper, we propose a non-probabilistic, neural-network-based solution that jointly captures a rich representation of words and documents. In experiments on two information retrieval tasks, using the TDT2 database and the TREC-8 and TREC-9 query sets, the proposed neural network model outperformed both PLSA and the classical TFIDF representation.
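
For readers unfamiliar with the classical TFIDF baseline mentioned above, the following is a minimal sketch of TFIDF weighting with cosine-similarity ranking in the vector space model. It is an illustration only, not the setup used in the experiments: the smoothed IDF formula, the whitespace tokenization, the toy documents, and the function names `tfidf_vectors`, `cosine`, and `rank` are all assumptions introduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; this particular variant is an assumption, many others exist.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within the document
        vectors.append({t: f * idf[t] for t, f in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices sorted by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection carry no weight and are dropped.
    q_tf = Counter(t for t in query if t in idf)
    q_vec = {t: f * idf[t] for t, f in q_tf.items()}
    scores = [cosine(q_vec, d) for d in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy collection (hypothetical data, for illustration only).
docs = [
    "neural network models for text".split(),
    "probabilistic latent semantic analysis".split(),
    "vector space model for retrieval".split(),
]
ranking = rank("neural text retrieval".split(), docs)
```

Here the first document shares two terms with the query, the third shares one, and the second shares none, so `ranking` orders them accordingly. Because this representation only matches exact terms, it gives no credit to related words, which is the kind of limitation that learned representations such as PLSA or a neural network model aim to address.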