With recent research interest in the confounding roles of homophily and contagion in studies of social influence, there is a strong need for reliable content-based measures of the similarity between people. In this paper, we investigate the use of text similarity measures as a way of predicting the similarity of prolific weblog authors. We describe a novel method of collecting human judgments of overall similarity between two authors, as well as demographic, political, cultural, religious, values, hobbies/interests, personality, and writing style similarity. We then apply a range of automated textual similarity measures based on word frequency counts, and calculate their statistical correlation with human judgments. Our findings indicate that commonly used text similarity measures do not correlate well with human judgments of author similarity. However, various measures that pay special attention to personal pronouns and their context correlate significantly with different facets of similarity.
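As an illustration of the kind of pipeline the abstract describes, the following minimal sketch computes a word-frequency cosine similarity between two authors' texts, a variant restricted to words near personal pronouns, and a Pearson correlation against human ratings. It is not the paper's implementation: the pronoun list, the context window size, the choice of cosine and Pearson, and all function names are assumptions made for this example, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Assumed pronoun list; the abstract does not say which pronouns were used.
PRONOUNS = {"i", "me", "my", "mine", "we", "us", "our", "you", "your",
            "he", "him", "his", "she", "her", "they", "them", "their"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text):
    """Word-frequency vector for one author's text."""
    return Counter(tokenize(text))


def pronoun_context_counts(text, window=1):
    """Count tokens within `window` positions of a personal pronoun.

    The pronoun itself is included. The window size is an assumption.
    """
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok in PRONOUNS:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            counts.update(tokens[lo:hi])
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson(xs, ys):
    """Pearson correlation between automated scores and human ratings."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)
```

In use, one would score each author pair with `cosine(word_counts(a), word_counts(b))` and with `cosine(pronoun_context_counts(a), pronoun_context_counts(b))`, then pass each list of scores to `pearson` together with the corresponding human similarity ratings to see which measure tracks the judgments more closely.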