SocialSearch+: enriching social network with web evidences

Authors:
Gae-Won You; Jin-Woo Park; Seung-Won Hwang; Zaiqing Nie; Ji-Rong Wen
Affiliations:
Pohang University of Science and Technology, Pohang, Republic of Korea (You, Park, Hwang); Microsoft Research Asia, Beijing, People's Republic of China (Nie, Wen)
Venue:
World Wide Web
Year:
2013



Abstract

This paper introduces the problem of searching for social network accounts (e.g., Twitter accounts) using the rich information available on the Web, such as people's names, attributes, and relationships to other people. To do so, we need to map Twitter accounts to Web entities. However, existing solutions built on naive textual matching inevitably suffer from low precision due to false positives (e.g., fake impersonator accounts) and false negatives (e.g., accounts using nicknames). To overcome these limitations, we leverage "relational" evidence extracted from the Web corpus. We consider two types of evidence resources. The first is web-scale entity relationship graphs, extracted from name co-occurrences crawled from the Web; this co-occurrence relationship can be interpreted as an "implicit" counterpart of Twitter follower relationships. The second is web-scale relational repositories, such as Freebase, which offer complementary strengths. Using both textual and relational features obtained from these resources, we learn a ranking function that aggregates these features to order candidate matches accurately. Another key contribution of this paper is to formulate confidence scoring as a problem separate from relevance ranking. A baseline approach is to use the relevance of the top match itself as the confidence score. In contrast, we train a separate classifier that uses not only the top relevance score but also various statistical features extracted from the relevance scores of all candidates, and we empirically validate that our approach outperforms the baseline. We evaluate the proposed system using real-life, Internet-scale entity-relationship and social network graphs.
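
To make the distinction between the baseline confidence score and classifier-based confidence scoring concrete, the following is a minimal sketch in Python. The abstract does not list the actual statistical features, so the feature set here (top score, gap to the runner-up, mean, standard deviation, candidate count) and the function names `baseline_confidence` and `confidence_features` are illustrative assumptions, not the authors' implementation. The classifier that would consume these features is omitted.

```python
from statistics import mean, pstdev


def baseline_confidence(scores):
    """Baseline from the abstract: the top match's relevance is the confidence."""
    return max(scores)


def confidence_features(scores):
    """Summarize the relevance scores of ALL candidates for one Web entity.

    The feature names below are hypothetical; they only illustrate the idea
    of describing the whole score distribution rather than the top score alone.
    """
    if not scores:
        raise ValueError("need at least one candidate score")
    ranked = sorted(scores, reverse=True)
    top = ranked[0]
    # Assumption: with a single candidate, treat the runner-up score as 0.0.
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return {
        "top": top,                # same signal the baseline uses
        "gap": top - runner_up,    # how clearly the top match stands out
        "mean": mean(ranked),      # overall level of the candidate scores
        "std": pstdev(ranked),     # spread of the candidate scores
        "count": len(ranked),      # number of candidate accounts
    }


# Two entities whose top candidates score the same under the baseline.
clear_case = [0.9, 0.2, 0.1]       # one dominant candidate
ambiguous_case = [0.9, 0.88, 0.85] # several near-identical candidates

clear_features = confidence_features(clear_case)
ambiguous_features = confidence_features(ambiguous_case)
```

In this sketch both cases receive the same baseline confidence, while the `gap` and `std` features differ, which is the kind of signal a separately trained classifier could use to tell a confident match from an ambiguous one.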