The Automatic Categorization of Arabic Documents by Boosting Decision Trees

Authors:
Saeed Raheel;Joseph Dichy;Mohamed Hassoun
Venue:
SITIS '09 Proceedings of the 2009 Fifth International Conference on Signal Image Technology and Internet Based Systems
Year:
2009

Citing 0
Cited 1

An empirical study on the feature's type effect on the automatic classification of Arabic documents

CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing

Abstract

Automatic document classification has been a subject of research since the early 1960s. Further research is still needed, however, because the results obtained so far leave room for improvement. Although the literature on the subject is extensive, very little of it addresses the automatic classification of Arabic documents, and none of that work applies Boosting. In addition, Arabic is a highly inflected language and is morphologically much more complex than languages written in Latin characters. One cannot, therefore, take for granted that Boosting is as effective for classifying Arabic documents as it is for documents written in Latin characters. This paper explores Boosting and its effectiveness for the automatic classification of Arabic documents, and compares its performance with results obtained with Support Vector Machines and Naïve Bayesian Networks.
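
The abstract does not say which Boosting variant, features, or tree depth the paper uses. As a rough illustration of the general idea only, the sketch below assumes AdaBoost over one-level decision trees (stumps) that test whether a single term is present in a document. The function names, the two-class setup, and the transliterated toy tokens are all hypothetical and are not taken from the paper.

```python
import math

def stump_predict(term, polarity, doc):
    # One-level decision tree: vote `polarity` if the term occurs, else the opposite.
    return polarity if term in doc else -polarity

def train_adaboost(docs, labels, rounds=10):
    """docs: list of token sets; labels: +1 or -1.
    Returns a list of (alpha, term, polarity) weighted stumps."""
    vocab = sorted(set().union(*docs))
    n = len(docs)
    weights = [1.0 / n] * n          # start with uniform document weights
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error on the current weights.
        best = None
        for term in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump_predict(term, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, term, polarity)
        err, term, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, term, polarity))
        # Increase the weight of misclassified documents, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(term, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, doc):
    # Sign of the alpha-weighted vote of all stumps.
    score = sum(alpha * stump_predict(term, polarity, doc)
                for alpha, term, polarity in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy corpus: +1 = sports, -1 = finance (transliterated tokens).
docs = [{"kura", "fariq"}, {"mubarah", "fariq"}, {"kura", "mubarah"},
        {"suq", "bank"}, {"ashum", "suq"}, {"bank", "ashum"}]
labels = [1, 1, 1, -1, -1, -1]

ensemble = train_adaboost(docs, labels, rounds=10)
predictions = [predict(ensemble, d) for d in docs]
```

No single stump separates this toy corpus, since every term is missing from one document of its own class; the weighted combination is what fits the training set. A real system for Arabic would also need tokenization and morphological preprocessing, which this sketch leaves out.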