Effective approaches to retrieving and using expertise in social media

Authors: Reyyan Yeniterzi
Affiliations: Carnegie Mellon University, Pittsburgh, PA, USA
Venue: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Year: 2013

Abstract

Expert retrieval has been widely studied, especially since the introduction of the Expert Finding task in TREC's Enterprise Track in 2005 [3]. This track provided two different test collections crawled from two organizations' public-facing websites and internal emails, which led to the development of many state-of-the-art algorithms for expert retrieval [1]. Until recently, these datasets were considered good representatives of the information resources available within enterprises. However, the recent growth of social media has also influenced the work environment, and social media has become a common communication and collaboration tool within organizations. According to a recent survey by the McKinsey Global Institute [2], 29% of companies use at least one social media tool to match their employees to tasks, and 26% assess their employees' performance using social media. This shows that intra-organizational social media has become an important resource for identifying expertise within organizations.

In recent years, in addition to intra-organizational social media, public social media tools such as Twitter, Facebook, and LinkedIn have also become common environments for searching for expertise. These tools give users an opportunity to show their specific skills to the world, which motivates recruiters to look for talented job candidates on social media, and writers and reporters to find experts to consult on the topics they are working on. With these motivations in mind, in this work we propose to develop expert retrieval algorithms for intra-organizational and public social media tools.

Social media datasets have both challenges and advantages. In terms of challenges, they do not always focus on one specific domain; instead, a single social media tool may contain discussions on technical topics, hobbies, or news concurrently. They may also contain spam posts or advertisements. Compared to well-edited enterprise documents, social media posts are much more informal in language. Furthermore, depending on the platform, posts may be limited in the number of characters they can contain. Despite these challenges, social media datasets also bring unique authority signals, such as votes, comments, and follower/following information, which can be useful in estimating expertise. Furthermore, compared to previously used enterprise documents, social media provides clear associations between documents and candidates through authorship information. In this work, we propose to develop expert retrieval approaches that handle these challenges while making use of the advantages.

Expert retrieval is a very useful application by itself; furthermore, it can be a step towards improving other social media applications. Social media is different from other web-based tools mainly because it depends on its users. In social media, users are not just content consumers; they are also the primary, and sometimes the only, content creators. Therefore, the quality of any user-generated content in social media depends on its creator. In this thesis, we propose to use the expertise of users to improve existing applications, so that they can estimate the relevance of content based not just on the content itself, but also on the expertise of its creator. By using the expertise of the content creator, we also hope to boost content that is more reliable. We propose to apply this user expertise information to improve ad-hoc search and question answering applications in social media. In this work, previous TREC enterprise datasets, available intra-organizational social media datasets, and public social media datasets will be used to test the proposed algorithms.
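
The abstract does not specify how content relevance and creator expertise would be combined, so the following is only an illustrative sketch under stated assumptions, not the author's method. It assumes each post already has a query relevance score in [0, 1], uses vote counts as the only authority signal, relies on authorship to link posts to candidates, and mixes the two scores by linear interpolation. All names (`Post`, `expertise_scores`, `rerank`, `weight`) and the sample data are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author: str           # authorship gives the document-candidate association
    content_score: float  # assumed query-to-text relevance in [0, 1]
    votes: int            # assumed authority signal (e.g., upvotes)


def expertise_scores(posts):
    """Estimate per-author expertise for the current query.

    Assumption: each post contributes its relevance weighted by
    (1 + votes); totals are normalized so the top author scores 1.0.
    """
    totals = defaultdict(float)
    for p in posts:
        totals[p.author] += p.content_score * (1 + max(p.votes, 0))
    top = max(totals.values(), default=0.0)
    if top == 0:
        return {author: 0.0 for author in totals}
    return {author: total / top for author, total in totals.items()}


def rerank(posts, weight=0.3):
    """Sort posts by a linear mix of content relevance and author expertise.

    weight=0.0 ranks by content alone; larger values favor posts
    written by authors with more query-related evidence.
    """
    expertise = expertise_scores(posts)

    def combined(p):
        return (1 - weight) * p.content_score + weight * expertise[p.author]

    return sorted(posts, key=combined, reverse=True)


# Hypothetical sample data for one query.
posts = [
    Post("p1", "alice", content_score=0.6, votes=10),
    Post("p2", "bob", content_score=0.7, votes=0),
    Post("p3", "alice", content_score=0.5, votes=5),
]

print([p.post_id for p in rerank(posts, weight=0.0)])  # ['p2', 'p1', 'p3']
print([p.post_id for p in rerank(posts, weight=0.3)])  # ['p1', 'p3', 'p2']
```

With content scores alone, the post by `bob` ranks first; once author expertise is mixed in, both posts by `alice`, who has more voted-on evidence for the query, move ahead of it. The interpolation weight and the vote weighting are arbitrary choices here and would need to be tuned or replaced on real data.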