Many historical texts have recently been digitized and made accessible for search and browsing. Because human language evolves constantly, these texts pose varying challenges to present-day readers. In this paper we report the results of large-scale studies of word usage and the evolution of English vocabulary over the last two centuries, with the aim of understanding how these changes affect the readability and retrieval of historical documents. We analyze several lexical factors that may influence the accessibility and readability of historical texts, using two large-scale lexical corpora: the Corpus of Historical American English and the Google Books 1-gram dataset.