The vast volume of subjective text spreading across the Internet has increased the demand for text sentiment classification technology. A well-known factor that often weakens classifier performance is the imbalanced distribution of review texts between the positive and negative classes. In this paper, we address the sentiment classification problem for imbalanced text sets. For this problem, we propose BRC, an algorithm that clarifies the disordered class boundary by cutting majority-class samples in the dense boundary region. The classifier is built on a Support Vector Machine. To find the best feature weighting scheme, the best combination strategy for sample cutting, and suitable parameters for BRC, three groups of experiments are designed on six text sets covering five domains. The experimental results show that the feature weighting scheme Presence performs best. The combination strategy BRC+RS offers a tradeoff between Precision and Recall on the two categories and yields a larger increase in the overall evaluation measure, Accuracy. It should be noted that the method of determining the parameters α and β in BRC is empirical. Although the boundary region cutting algorithm BRC is aimed at text sentiment classification, we believe that it is also suitable for any two-category classification problem with imbalanced sample data.
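To make the two ingredients named above more concrete, the following is a minimal sketch in plain Python. It is not the BRC algorithm itself, whose details and parameters α and β are not given in this abstract. It shows binary Presence weighting (1 if a term occurs in a document, 0 otherwise) and a simple, assumed neighbourhood heuristic that removes majority-class samples lying among minority-class samples. The function names, the Hamming distance, and the parameters `k` and `min_minority_neighbors` are all hypothetical choices made for illustration.

```python
def presence_vector(tokens, vocabulary):
    """Binary 'Presence' weighting: 1 if the term occurs in the document, else 0."""
    seen = set(tokens)
    return [1 if term in seen else 0 for term in vocabulary]


def hamming(u, v):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def cut_boundary_majority(X, y, majority_label, k=3, min_minority_neighbors=2):
    """Assumed heuristic (not the paper's BRC): drop a majority-class sample
    when at least `min_minority_neighbors` of its k nearest neighbours
    belong to the minority class. Minority samples are always kept."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)
            continue
        # Rank all other samples by distance; ties are broken by index.
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: (hamming(xi, X[j]), j),
        )
        minority_near = sum(1 for j in others[:k] if y[j] != majority_label)
        if minority_near < min_minority_neighbors:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


if __name__ == "__main__":
    vocabulary = ["good", "bad", "great", "poor"]  # toy vocabulary (assumption)
    docs = [
        (["good", "bad"], "pos"),
        (["bad", "good", "good"], "pos"),
        (["good", "bad", "great"], "pos"),
        (["great", "poor"], "pos"),   # majority sample sitting among minority ones
        (["great", "poor"], "neg"),
        (["poor"], "neg"),
    ]
    X = [presence_vector(tokens, vocabulary) for tokens, _ in docs]
    y = [label for _, label in docs]
    X_cut, y_cut = cut_boundary_majority(X, y, "pos", k=2, min_minority_neighbors=2)
    print(len(X), "samples before cutting,", len(X_cut), "after")
```

The reduced set `X_cut`, `y_cut` could then be passed to any two-class learner, such as a Support Vector Machine. How the cut region should be defined and how strongly to cut are exactly the empirical choices the abstract attributes to α and β, so the thresholds here should be read as placeholders.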