Identifying user attributes through non-i.i.d. multi-instance learning

Authors:
Hyun-Je Song; Jeong-Woo Son; Seong-Bae Park
Affiliations:
Kyungpook National University, Daegu, Korea; Kyungpook National University, Daegu, Korea; Kyungpook National University, Daegu, Korea
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Abstract

User attributes are essential for personalized recommendation and targeted advertising. A number of studies have therefore tried to identify user attributes automatically from social networking service (SNS) postings, since postings reveal various attributes of their writers. Many machine learning methods have been applied to this task, but they suffer from two major problems. First, many SNS postings carry no information about their writers, so a model learned from individual postings is biased by these irrelevant postings. Second, the postings of an SNS user are related to one another, yet most machine learning methods ignore this information because they assume that data are independently and identically distributed (i.i.d.). To solve these problems, this paper proposes a novel method based on non-i.i.d. multi-instance learning. Multi-instance learning treats all postings by a user as a bag and learns user attribute identification from such bags rather than from individual postings, which addresses the first problem. In addition, the proposed method assumes that the postings by a single user have a structure; incorporating this assumption into multi-instance learning addresses the second problem. Our experimental results show that accounting for these two problems improves the performance of automatic user attribute identification.
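The abstract does not give the formulation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each posting has already been turned into a numeric feature vector, and it borrows a graph-style bag kernel in the spirit of miGraph (Zhou, Sun, and Li, ICML 2009) as a stand-in for "postings by a single user have a structure". The function names, the distance threshold `delta`, and the RBF width `gamma` are all hypothetical choices made for illustration.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian RBF similarity between two posting feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def instance_weights(bag, delta=1.0):
    """Weight each posting by 1 / (number of postings within distance delta).

    This encodes a simple structure inside the bag: postings that sit in a
    cluster of similar postings share their influence, so near-duplicates
    do not dominate the representation of the user.
    """
    weights = []
    for x in bag:
        # The count includes x itself, so it is always at least 1.
        neighbours = sum(1 for y in bag if math.dist(x, y) < delta)
        weights.append(1.0 / neighbours)
    return weights


def bag_kernel(bag_a, bag_b, delta=1.0, gamma=0.5):
    """Similarity between two users, each given as a bag of posting vectors.

    The result is a weighted average of posting-to-posting similarities,
    so the comparison is made between bags (users), not single postings.
    """
    weights_a = instance_weights(bag_a, delta)
    weights_b = instance_weights(bag_b, delta)
    total = sum(
        wa * wb * rbf(x, y, gamma)
        for wa, x in zip(weights_a, bag_a)
        for wb, y in zip(weights_b, bag_b)
    )
    return total / (sum(weights_a) * sum(weights_b))


# Toy data (assumed): two users, each a bag of 2-dimensional posting vectors.
user_a = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]]
user_b = [[0.0, 0.1], [3.0, 2.9]]
similarity = bag_kernel(user_a, user_b)
```

A matrix of such bag-to-bag similarities over all labelled users could then be handed to any classifier that accepts a precomputed kernel, with one attribute label per user rather than per posting. Labelling at the bag level is what lets uninformative postings exist in a bag without each one being treated as a training example.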