We consider the task of predicting the gender of YouTube users and contrast two information sources: the comments they leave and the social environment induced from the affiliation graph of users and videos. We propagate gender information through the videos and show that a user's gender can be predicted from her social environment with accuracy above 90%. We also show that gender can be predicted from language alone (89% accuracy). A surprising result of our study is that the language-based predictions correlate more strongly with the gender predominant in the user's environment than with the sex of the person as reported in the profile. We also investigate how the two views (linguistic and social) can be combined and analyse how prediction accuracy changes across age groups.
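
The abstract does not spell out how gender information is propagated through the videos. As a rough illustration only, the sketch below assumes a single round of averaging over the user-video affiliation graph: each video takes the mean label of its users with known gender, and each unlabelled user takes the mean score of the videos she is affiliated with. The function name `propagate_gender`, the 1.0/0.0 label encoding, and the 0.5 fallback are hypothetical choices for this example, not the procedure used in the study.

```python
from collections import defaultdict


def propagate_gender(edges, seed_labels, default=0.5):
    """One round of label averaging on a bipartite user-video graph.

    edges: iterable of (user, video) affiliation pairs.
    seed_labels: dict mapping user -> 1.0 or 0.0 (assumed gender encoding).
    Returns a dict mapping each unlabelled user to a score in [0, 1].
    """
    video_users = defaultdict(set)
    user_videos = defaultdict(set)
    for user, video in edges:
        video_users[video].add(user)
        user_videos[user].add(video)

    # Step 1: score each video by the mean label of its labelled users.
    video_score = {}
    for video, users in video_users.items():
        labels = [seed_labels[u] for u in users if u in seed_labels]
        if labels:
            video_score[video] = sum(labels) / len(labels)

    # Step 2: score each unlabelled user by the mean of her videos' scores.
    user_score = {}
    for user, videos in user_videos.items():
        if user in seed_labels:
            continue
        scores = [video_score[v] for v in videos if v in video_score]
        # Fall back to the assumed prior when no video carries any signal.
        user_score[user] = sum(scores) / len(scores) if scores else default
    return user_score


edges = [
    ("a", "v1"), ("b", "v1"), ("c", "v1"), ("f", "v1"),
    ("c", "v2"), ("d", "v2"),
    ("e", "v3"),
]
seeds = {"a": 1.0, "b": 1.0, "d": 0.0}
scores = propagate_gender(edges, seeds)
```

In this toy graph, user `f` shares a video only with users labelled 1.0 and so receives a score of 1.0, while `c` sits between two differently labelled videos and `e` has no labelled neighbours at all. A fuller treatment would likely iterate the propagation and weight the edges, but those details are not given in the abstract.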