Sentiment classification based on supervised latent n-gram analysis
Proceedings of the 20th ACM international conference on Information and knowledge management
In this paper, we introduce a novel approach for modeling n-grams in a latent space learned from supervised signals. The proposed procedure uses only unigram features to model short phrases (n-grams) in the latent space. The phrases are then combined to form a document-level latent representation of a given text, where the position of each n-gram in the document determines its combining weight. The resulting two-stage supervised embedding is coupled with a classifier to form an end-to-end system, which we apply to large-scale sentiment classification. The proposed model does not require feature selection during pre-processing to retain effective features, and its parameter space grows linearly with the n-gram size. We present comparative evaluations of this method on two large-scale datasets for sentiment classification of online reviews (Amazon and TripAdvisor). The proposed method outperforms standard baselines that rely on a bag-of-words representation populated with n-gram features.
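
The two-stage embedding described in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. It is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the layer shapes, the nonlinearity, or the form of the position weighting, so the concatenate-and-project n-gram layer, the `tanh` activation, the bucketed position weights, and every name and size below (`E`, `W`, `bucket_logits`, `VOCAB`, `BUCKETS`, and so on) are hypothetical choices made for illustration. Parameters are randomly initialized here; in the described system they would be trained jointly with the classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are illustrative assumptions, not values from the paper.
VOCAB, WORD_DIM, NGRAM_DIM, N, BUCKETS = 1000, 8, 16, 3, 4

# Stage 1 parameters: unigram embeddings and an n-gram projection.
E = rng.normal(scale=0.1, size=(VOCAB, WORD_DIM))
W = rng.normal(scale=0.1, size=(N * WORD_DIM, NGRAM_DIM))
b = np.zeros(NGRAM_DIM)

# Stage 2 parameters: one weight logit per relative-position bucket
# (an assumed form of "position determines the combining weight").
bucket_logits = np.zeros(BUCKETS)


def embed_ngrams(token_ids):
    """Map every n-gram window to a latent vector using unigram embeddings only."""
    tokens = np.asarray(token_ids)
    if len(tokens) < N:
        raise ValueError("document is shorter than the n-gram size")
    # Shape (num_windows, N): each row holds the token ids of one n-gram.
    windows = np.lib.stride_tricks.sliding_window_view(tokens, N)
    # Concatenate the N unigram embeddings of each window, then project.
    concat = E[windows].reshape(len(windows), N * WORD_DIM)
    return np.tanh(concat @ W + b)


def embed_document(token_ids):
    """Combine n-gram vectors into one document vector with position weights."""
    phrases = embed_ngrams(token_ids)
    count = len(phrases)
    # Assign each n-gram to a bucket by its relative position in the document.
    buckets = (np.arange(count) * BUCKETS) // count
    weights = np.exp(bucket_logits[buckets])
    weights /= weights.sum()
    return weights @ phrases


doc_vector = embed_document([1, 2, 3, 4, 5, 6, 7])
```

In this sketch the only component whose size depends on the n-gram length is `W`, with `N * WORD_DIM * NGRAM_DIM` entries, which is one way to obtain the linear growth in parameters that the abstract mentions. The resulting `doc_vector` would be passed to a classifier, and the gradients would flow back through both stages.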