Understanding the thematic structure of the Qur'an: an exploratory multivariate approach

Authors:
Naglaa Thabet
Affiliations:
University of Newcastle, Newcastle upon Tyne, UK
Venue:
ACLstudent '05 Proceedings of the ACL Student Research Workshop
Year:
2005

Abstract

In this paper, we develop a methodology for discovering the thematic structure of the Qur'an based on a fundamental idea in data mining and related disciplines: that, with respect to some collection of texts, the lexical frequency profiles of the individual texts are a good indicator of their conceptual content, and thus provide a reliable criterion for their classification relative to one another. This idea is applied to the discovery of thematic interrelationships among the suras (chapters) of the Qur'an by abstracting lexical frequency data from them and then applying hierarchical cluster analysis to that data. The results reported here indicate that the proposed methodology yields usable results in understanding the Qur'an on the basis of its lexical semantics.
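
The two steps named in the abstract, abstracting lexical frequency data from each text and then clustering that data hierarchically, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy English documents stand in for suras, tokenization is plain whitespace splitting, profiles are relative frequencies, distance is Euclidean, and clusters are merged by average linkage. The function names `frequency_profiles` and `agglomerate` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def frequency_profiles(texts):
    """Return the shared vocabulary and one relative-frequency vector per text."""
    counts = {name: Counter(text.lower().split()) for name, text in texts.items()}
    vocab = sorted(set().union(*counts.values()))
    profiles = {}
    for name, c in counts.items():
        total = sum(c.values())
        # Counter returns 0 for words that do not occur in this text.
        profiles[name] = [c[w] / total for w in vocab]
    return vocab, profiles


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def agglomerate(profiles):
    """Average-linkage agglomerative clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of text names.
    """
    clusters = {(name,): [vec] for name, vec in profiles.items()}
    merges = []

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        pairs = [(u, v) for u in clusters[a] for v in clusters[b]]
        return sum(euclidean(u, v) for u, v in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: linkage(*p))
        dist = linkage(a, b)
        clusters[tuple(sorted(a + b))] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, dist))
    return merges


# Toy stand-in documents (hypothetical, not Qur'anic text).
texts = {
    "doc_a": "mercy prayer mercy faith prayer",
    "doc_b": "prayer mercy faith faith mercy",
    "doc_c": "law trade contract law inheritance",
    "doc_d": "trade law contract contract inheritance",
}

vocab, profiles = frequency_profiles(texts)
merges = agglomerate(profiles)
for a, b, dist in merges:
    print(f"merge {a} + {b} at distance {dist:.3f}")
```

In this toy run the documents that share vocabulary are joined first, and the two resulting groups are joined last, which is the kind of grouping a dendrogram would display. A real application would need language-appropriate tokenization and some handling of differences in text length, neither of which this sketch attempts.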