In the information retrieval community, the Centroid Classifier has been shown to be a simple yet effective method for text categorization. However, it often suffers from model misfit (or inductive bias) incurred by its assumption. Various methods have been proposed to address this issue, such as Weight Adjustment, Voting, Refinement and DragPushing. However, these existing methods employ only one criterion, namely training-set error. Research in machine learning indicates that methods based on training-set error alone cannot guarantee the generalization capability of base classifiers on unseen examples. To overcome this problem, we propose a novel Model Adjustment algorithm, which makes use of training-set margins as well as training-set errors. Furthermore, we prove that, for a linearly separable problem, the proposed method converges to the optimal solution after a finite number of updates using any learning parameter η (η > 0). An empirical assessment conducted on four benchmark collections indicates that the proposed method performs slightly better than an SVM classifier in prediction accuracy and also runs faster.
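To make the idea of combining training-set errors with training-set margins concrete, the following is a minimal illustrative sketch in Python. It is not the Model Adjustment algorithm from the paper, whose exact update rule is not given in this abstract. It assumes a simple drag-and-push style update: whenever a training document's margin (its score for the true class minus its best score for any other class) falls below a threshold, the true centroid is pulled toward the document and the closest rival centroid is pushed away. The function names, the `min_margin` threshold, the default `eta`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict of term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else dict(vec)


def dot(a, b):
    """Dot product of two sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def train_centroids(docs, labels):
    """Plain centroid classifier: one unit-length mean vector per class."""
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for t, w in normalize(doc).items():
            sums[label][t] += w
    return {label: normalize(vec) for label, vec in sums.items()}


def margin(centroids, doc, label):
    """Score for the true class minus the best score among rival classes."""
    x = normalize(doc)
    rivals = [dot(x, v) for c, v in centroids.items() if c != label]
    return dot(x, centroids[label]) - max(rivals)


def predict(centroids, doc):
    """Return the class whose centroid scores highest for the document."""
    x = normalize(doc)
    return max(sorted(centroids), key=lambda c: dot(x, centroids[c]))


def adjust(centroids, docs, labels, eta=0.1, min_margin=0.1, epochs=20):
    """Assumed update rule (illustration only, needs at least two classes).

    A margin below zero is a training-set error; a margin between zero and
    min_margin is a correct but low-confidence prediction. Both trigger an
    update, so the criterion covers errors as well as margins.
    """
    for _ in range(epochs):
        updated = False
        for doc, label in zip(docs, labels):
            x = normalize(doc)
            scores = {c: dot(x, v) for c, v in centroids.items()}
            rival = max((c for c in sorted(scores) if c != label),
                        key=scores.get)
            if scores[label] - scores[rival] < min_margin:
                for t, w in x.items():
                    # Drag the true centroid toward the document ...
                    centroids[label][t] = centroids[label].get(t, 0.0) + eta * w
                    # ... and push the closest rival centroid away from it.
                    centroids[rival][t] = centroids[rival].get(t, 0.0) - eta * w
                updated = True
        if not updated:  # every training margin already meets the threshold
            break
    return centroids


# Hypothetical toy corpus: term-frequency dicts with class labels.
docs = [{"ball": 1, "goal": 1}, {"ball": 1, "team": 1},
        {"code": 1, "bug": 1}, {"code": 1, "team": 1}]
labels = ["sports", "sports", "tech", "tech"]

centroids = adjust(train_centroids(docs, labels), docs, labels)
```

In this sketch the centroids are not renormalized after each update, and `eta` plays the role of the learning parameter η mentioned above. Setting `min_margin` to zero reduces the rule to a purely error-driven adjustment, which is the single-criterion setting the abstract argues against.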