This paper introduces the problem of matching people's names to their corresponding social network identities, such as their Twitter accounts. Existing tools for this purpose build upon naive textual matching and inevitably suffer from low precision, due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage "relational" evidence extracted from the Web corpus. As one example of such evidence, we adopt Web document co-occurrences, which can be interpreted as an "implicit" counterpart of Twitter follower relationships. Using both textual and relational features, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a problem separate from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier that uses not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and we empirically show that it outperforms the baseline approach. We evaluate our proposed system using real-life, Internet-scale entity-relationship and social network graphs.
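The two steps described in the abstract, relevance ranking and separate confidence scoring, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names (`name_similarity`, `web_cooccurrence`), the fixed weights, the account handles, and the particular statistics computed are hypothetical and are not taken from the paper, which learns its ranking function and trains a classifier rather than using hand-set values.

```python
from statistics import mean, pstdev

# Hypothetical weights standing in for a learned ranking function.
WEIGHTS = {"name_similarity": 0.6, "web_cooccurrence": 0.4}


def relevance(features):
    """Aggregate textual and relational features into one relevance score."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


def rank_candidates(candidates):
    """Return (account, score) pairs sorted by descending relevance."""
    scored = [(account, relevance(f)) for account, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def confidence_features(ranked):
    """Summarize the whole score distribution, not just the top score.

    A separate classifier (not shown) would consume these features to
    decide whether the top match should be trusted.
    """
    if not ranked:
        raise ValueError("no candidates to score")
    scores = [score for _, score in ranked]
    second = scores[1] if len(scores) > 1 else 0.0
    return {
        "top_score": scores[0],
        "margin": scores[0] - second,  # gap between best and runner-up
        "mean": mean(scores),
        "std": pstdev(scores),
        "num_candidates": len(scores),
    }


# Hypothetical candidate accounts for the name "John Smith".
candidates = {
    "@john_smith": {"name_similarity": 1.0, "web_cooccurrence": 0.1},
    "@jsmith_official": {"name_similarity": 0.7, "web_cooccurrence": 0.9},
    "@johnny_s": {"name_similarity": 0.4, "web_cooccurrence": 0.2},
}

ranked = rank_candidates(candidates)
feats = confidence_features(ranked)
print(ranked[0][0], feats)
```

In this toy data the exact textual match `@john_smith` loses to `@jsmith_official` because the latter has much stronger co-occurrence evidence, which mirrors the motivation for combining textual and relational features. The baseline described above would use only `top_score` as the confidence; the sketch shows the kind of additional distribution-level features a separate classifier could draw on.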