Knowing how terms behave in written text can help us tailor models, algorithms, and resources to improve access to digital libraries and to answer information needs in archives that span long periods. In this paper we investigate the behavior of written English in blogs in comparison to traditional texts from the New York Times, The Times Archive, and the British National Corpus. We show that user-generated content, like spoken content, differs in its characteristics from 'professionally' written text and exhibits more dynamic behavior.