OCA: Opinion corpus for Arabic

  • Authors:
  • Mohammed Rushdi-Saleh;M. Teresa Martín-Valdivia;L. Alfonso Ureña-López;José M. Perea-Ortega

  • Affiliations:
  • SINAI Research Group, Computer Science Department, University of Jaén, 23071, Spain (all four authors)

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2011

Abstract

Sentiment analysis is a challenging new task related to text mining and natural language processing. Although there are, at present, several studies related to this theme, most of these focus mainly on English texts. The resources available for opinion mining (OM) in other languages are still limited. In this article, we present a new Arabic corpus for the OM task that has been made available to the scientific community for research purposes. The corpus contains 500 movie reviews collected from different web pages and blogs in Arabic, 250 of them considered as positive reviews, and the other 250 as negative opinions. Furthermore, different experiments have been carried out on this corpus, using machine learning algorithms such as support vector machines and Naïve Bayes. The results obtained are very promising and we are encouraged to continue this line of research. © 2011 Wiley Periodicals, Inc.
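
To make the classification setup concrete, the following is a minimal sketch of a multinomial Naïve Bayes polarity classifier of the general kind the abstract mentions. It is not the authors' implementation: the abstract does not describe their features, preprocessing, or tooling. The function names (`tokenize`, `train_naive_bayes`, `predict`) and the four English toy reviews are hypothetical and are not drawn from the OCA corpus; whitespace tokenization and add-one smoothing are assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace tokenization. Real Arabic reviews would
    # likely need normalization or stemming, which this sketch omits.
    return text.split()


def train_naive_bayes(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes model."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # token counts per class
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore tokens never seen in training
            # Add-one (Laplace) smoothed log likelihood.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, balanced like the corpus but NOT taken from it.
train_docs = [
    "great film wonderful acting",
    "loved the story great cast",
    "boring plot terrible acting",
    "awful film waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "wonderful story"))
print(predict(model, "boring terrible plot"))
```

On a balanced two-class corpus such as the one described (250 positive and 250 negative reviews), the class priors are equal, so the decision rests entirely on the per-class token likelihoods.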