A major concern when incorporating large sets of diverse n-gram features for sentiment classification is the presence of noisy, irrelevant, and redundant attributes. These problems often make it difficult to harness the augmented discriminatory potential of extended feature sets. We propose a rule-based multivariate text feature selection method called Feature Relation Network (FRN) that considers semantic information and also leverages the syntactic relationships between n-gram features. FRN is intended to efficiently enable the inclusion of extended sets of heterogeneous n-gram features for enhanced sentiment classification. Experiments were conducted on three online review testbeds in comparison with methods used in prior sentiment classification research. FRN outperformed the univariate, multivariate, and hybrid feature selection methods it was compared against, selecting attributes that yielded significantly better classification accuracy regardless of feature subset size. Furthermore, by incorporating syntactic information about n-gram relations, FRN is able to select features in a more computationally efficient manner than many multivariate and hybrid techniques.
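The abstract does not spell out FRN's rules, so the following is only a minimal sketch of the general idea of using relations between n-grams to prune redundant features. It assumes a simple rule that is an illustration and not necessarily the paper's: a higher-order n-gram is kept only if its information gain exceeds that of the lower-order n-grams it contains by a margin. The names `information_gain`, `select_with_subsumption`, and `margin`, and the toy reviews, are all hypothetical.

```python
from collections import Counter
from math import log2


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def entropy(pos, neg):
    """Binary entropy (in bits) of a class distribution given raw counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def information_gain(docs, labels, max_n=2):
    """Score each n-gram (presence/absence) by information gain over labels."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    base = entropy(n_pos, n_neg)
    present, present_pos = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        feats = set()
        for n in range(1, max_n + 1):
            feats.update(ngrams(tokens, n))
        for f in feats:
            present[f] += 1
            present_pos[f] += y
    total = len(docs)
    scores = {}
    for f, k in present.items():
        kp = present_pos[f]
        h_with = entropy(kp, k - kp)
        h_without = entropy(n_pos - kp, n_neg - (k - kp))
        scores[f] = base - (k / total) * h_with - ((total - k) / total) * h_without
    return scores


def select_with_subsumption(scores, margin=0.05):
    """Keep all unigrams; keep a longer n-gram only if it beats the best
    of its two contained (n-1)-grams by at least `margin` (assumed rule)."""
    kept = {}
    for f in sorted(scores, key=len):
        if len(f) > 1:
            parts = (f[:-1], f[1:])  # the two (n-1)-grams contained in f
            best_part = max(scores.get(p, 0.0) for p in parts)
            if scores[f] < best_part + margin:
                continue  # redundant: adds little beyond its parts
        kept[f] = scores[f]
    return kept


# Toy data (hypothetical): 1 = positive review, 0 = negative review.
docs = [
    ["great", "movie"], ["great", "movie"], ["not", "good"], ["not", "bad"],
    ["good"], ["bad"], ["not", "good"], ["bad"],
]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

scores = information_gain(docs, labels)
kept = select_with_subsumption(scores)
for feature in sorted(kept, key=kept.get, reverse=True):
    print(" ".join(feature), round(kept[feature], 3))
```

On this toy data the bigram "great movie" carries no more information than "great" alone and is discarded, while "not good" is retained because it is far more discriminative than either "not" or "good". Because each bigram is compared only against the n-grams it contains, rather than against every other feature, the check stays cheap as the feature set grows.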