You're not from 'round here, are you?: naive Bayes detection of non-native utterance text
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
The form is the substance: classification of genres in text
HLTKM '01 Proceedings of the workshop on Human Language Technology and Knowledge Management - Volume 2001
Hurrian orthographic interference in Nuzi Akkadian: a computational comparative graphemic analysis
This paper presents a problem within Hittite and Ancient Near Eastern studies, that of fragmented and damaged cuneiform texts, and proposes using well-known text classification metrics, in combination with some facts about the structure of Hittite-language cuneiform texts, to help classify fragments of clay cuneiform-script tablets into more complete texts. In particular, I propose using Sumerian and Akkadian ideogrammatic signs within Hittite texts to improve the performance of Naive Bayes and Maximum Entropy classifiers. Performance improves in some cases and very much does not in others, suggesting that the variable frequency of these ideograms in individual fragments makes a considerable difference to the ideal choice of classification method. Further, the complexities of the writing system and the digital availability of Hittite texts complicate the problem.
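As a rough illustration of the idea of giving ideograms extra weight in a Naive Bayes classifier, the sketch below trains a small multinomial model on transliterated sign tokens. It is not the paper's implementation. It assumes that ideograms are the tokens transliterated entirely in upper case, and the helper names (`is_ideogram`, `features`, `NaiveBayes`), the `weight` parameter, the labels, and the toy fragments are all hypothetical, invented only for this example.

```python
import math
from collections import Counter, defaultdict


def is_ideogram(token):
    # Assumption: ideograms are written in upper case in transliteration
    # (e.g. LUGAL), while syllabic Hittite signs are in lower case.
    letters = [c for c in token if c.isalpha()]
    return bool(letters) and all(c.isupper() for c in letters)


def features(tokens, weight=2):
    # Count every sign once, then add a separate "IDEO:" feature for each
    # ideogram so that it contributes `weight` extra counts.
    feats = Counter(tokens)
    for tok in tokens:
        if is_ideogram(tok):
            feats["IDEO:" + tok] += weight
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, docs, labels):
        for feats, label in zip(docs, labels):
            self.class_counts[label] += 1
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        n_docs = sum(self.class_counts.values())
        v = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for feat, k in feats.items():
                if feat not in self.vocab:
                    continue  # ignore features never seen in training
                p = (self.feat_counts[label][feat] + self.alpha) / (total + self.alpha * v)
                score += k * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy fragments (not real tablet data): each is a list of sign tokens.
train = [
    (["LUGAL", "ma", "an", "GAL"], "text_A"),
    (["LUGAL", "nu", "za"], "text_A"),
    (["DINGIR", "nu", "an", "SISKUR"], "text_B"),
    (["DINGIR", "ma", "za"], "text_B"),
]
model = NaiveBayes().fit([features(t) for t, _ in train], [lab for _, lab in train])
prediction = model.predict(features(["nu", "LUGAL"]))
```

Setting `weight=0` reduces this to a plain bag-of-signs model, which gives a simple way to compare classification with and without the ideogram emphasis. A fragment containing few or no ideograms gains nothing from the extra features, which is consistent with the abstract's observation that results depend on how often ideograms occur in individual fragments.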