Despite differences in the way that men and women experience goods and communicate their perspectives, online review communities typically do not disclose participants' gender. We propose to infer author gender, given a set of reviews of a particular item, and experiment on reviews posted at the Internet Movie Database (IMDb). Using logistic regression, we explore the contribution of three types of information: 1) style, 2) content, and 3) metadata (e.g., review age, social feedback). Our results concur with previous research, in that there are salient differences in writing style and content between reviews authored by men and those authored by women. However, in comparison to the literary or scientific texts to which such classification tasks are often applied, reviews are brief and occur within the context of an ongoing discourse. Therefore, to compensate for the brevity of reviews, content and stylistic features can be augmented with metadata. We find in particular that the perceived utility of a review is an important correlate of gender. The model incorporating all features has a classification accuracy of 73.7% and is not as sensitive to review length as are models based only on stylistic or content features.
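To make the setup above concrete, the following is a minimal sketch of combining style, content, and metadata features in one logistic regression model. It is not the authors' implementation: the specific features (mean word length, exclamation rate, a tiny vocabulary, share of helpful votes, review age in years), the function names, the 0/1 label encoding, and the plain stochastic gradient descent trainer are all assumptions chosen for illustration.

```python
import math


def extract_features(text, helpful_votes, total_votes, age_days, vocabulary):
    """Map one review to a vector of style + content + metadata features.

    All feature choices here are hypothetical stand-ins for the three
    information types named in the abstract.
    """
    tokens = [w.strip(".,!?;:") for w in text.lower().split()]
    words = [w for w in tokens if w]
    n_words = max(len(words), 1)
    # Style: mean word length and exclamation marks per word.
    style = [sum(len(w) for w in words) / n_words, text.count("!") / n_words]
    # Content: binary bag-of-words over a small fixed vocabulary.
    content = [1.0 if term in words else 0.0 for term in vocabulary]
    # Metadata: perceived utility (share of helpful votes) and age in years.
    utility = helpful_votes / total_votes if total_votes else 0.0
    metadata = [utility, age_days / 365.0]
    return style + content + metadata


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability that a review belongs to the class labelled 1."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            bias -= lr * error
            weights = [w - lr * error * f for w, f in zip(weights, features)]
    return weights, bias
```

In this sketch, each labelled review would be passed through `extract_features`, the resulting vectors fitted with `train_logistic_regression`, and a new review classified by thresholding `predict_proba` at 0.5. Dropping the metadata entries from the vector gives a style-and-content-only baseline for comparison.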