We propose a latent variable model to enhance historical analysis of large corpora. This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components of that metadata, in a general way. To test the model, we collect a corpus of slavery-related United States property law judgements sampled from the years 1730 to 1866. We study the language use in these legal cases, with a special focus on shifts in opinions on controversial topics across different regions. Because this is a longitudinal data set, we are also interested in understanding how these opinions change over the course of decades. We show that the joint learning scheme of our sparse mixed-effects model improves on other state-of-the-art generative and discriminative models on the region and time period identification tasks. Experiments show that our sparse mixed-effects model is quantitatively more accurate and qualitatively more interesting, and that these improvements are robust across different parameter settings.