Automatic verb extraction from historical Swedish texts

Authors:
Eva Pettersson;Joakim Nivre
Affiliations:
Uppsala University;Uppsala University
Venue:
LaTeCH '11 Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
Year:
2011

Citing 3
Cited 2

TnT: a statistical part-of-speech tagger

ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Natural Language Processing Across Time: An Empirical Investigation on Italian

GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
HunPos: an open source trigram tagger

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions

Parsing the past: identification of verb constructions in historical text

LaTeCH '12 Proceedings of the 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
Historical analysis of legal opinions with a sparse mixed-effects latent variable model

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Quantified Score

Hi-index	0.00

Visualization

Abstract

Even though historical texts reveal a lot of interesting information on culture and social structure in the past, information access is limited and in most cases the only way to find the information you are looking for is to manually go through large volumes of text, searching for interesting text segments. In this paper we will explore the idea of facilitating this time-consuming manual effort, using existing natural language processing techniques. Attention is focused on automatically identifying verbs in early modern Swedish texts (1550--1800). The results indicate that it is possible to identify linguistic categories such as verbs in texts from this period with a high level of precision and recall, using morphological tools developed for present-day Swedish, if the text is normalised into a more modern spelling before the morphological tools are applied.