Online forms of expression such as social networks, micro-blogging, blogs, and rich content sharing platforms have proliferated in the last few years. This proliferation has contributed to the vast explosion in online data sharing we are experiencing today. One unique aspect of online data sharing is the tags that content generators insert manually to facilitate content description and discovery (e.g., hashtags in tweets). In this paper we focus on these tags: we study and propose algorithms that use them to automatically organize and categorize this vast collection of socially contributed and tagged information. In particular, we take a holistic approach to organizing such tags and propose algorithms to partition as well as rank this information collection. Our partitioning algorithms aim to segment the entire collection of tags (and the associated content) into a specified number of partitions under specific problem constraints. In contrast, our ranking algorithms aim to identify a few partitions quickly, for suitably defined ranking functions. We present a detailed experimental study utilizing the full Twitter firehose (the set of all tweets in the Twitter service) that attests to the practical utility and effectiveness of our overall approach. We also present a detailed qualitative study of our results.