Learning similarity metrics for event identification in social media

Authors:
Hila Becker; Mor Naaman; Luis Gravano
Affiliations:
Columbia University, New York, NY, USA; Rutgers University, New Brunswick, NJ, USA; Columbia University, New York, NY, USA
Venue:
Proceedings of the third ACM international conference on Web search and data mining
Year:
2010

Abstract

Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host substantial amounts of user-contributed material (e.g., photographs, videos, and textual content) for a wide variety of real-world events of different types and scales. By automatically identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can enable event browsing and search in state-of-the-art search engines. To address this problem, we exploit the rich "context" associated with social media content, including user-provided annotations (e.g., title, tags) and automatically generated information (e.g., content creation time). Using this rich context, which includes both textual and non-textual features, we can define appropriate document similarity metrics to enable online clustering of media into events. As a key contribution of this paper, we explore a variety of techniques for learning multi-feature similarity metrics for social media documents in a principled manner. We evaluate our techniques on large-scale, real-world datasets of event images from Flickr. Our evaluation results suggest that our approach identifies events, and their associated social media documents, more effectively than the state-of-the-art strategies on which we build.
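
To make the idea of a multi-feature similarity metric driving online clustering more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each document is a dictionary with a `tags` bag of words and a `time` creation timestamp in seconds, combines a textual and a temporal similarity with fixed weights, and assigns each incoming document to the most similar existing cluster or starts a new one. All function names, the one-day time window, the weights, and the threshold are hypothetical; in the paper the combination is learned rather than hand-set.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def time_similarity(t1, t2, window=86400.0):
    """1.0 for identical timestamps, falling linearly to 0.0 at `window` seconds.

    The one-day window is an illustrative assumption.
    """
    return max(0.0, 1.0 - abs(t1 - t2) / window)


def combined_similarity(doc, cluster, weights):
    """Weighted sum of per-feature similarities.

    `weights` stands in for whatever a learned model would supply.
    """
    text = cosine(doc["tags"], cluster["tags"])
    time = time_similarity(doc["time"], cluster["time"])
    return weights["tags"] * text + weights["time"] * time


def online_cluster(docs, weights, threshold=0.5):
    """Single-pass clustering: each document joins its best cluster or opens a new one."""
    clusters = []
    assignments = []
    for doc in docs:
        best, best_sim = None, 0.0
        for i, cluster in enumerate(clusters):
            sim = combined_similarity(doc, cluster, weights)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= threshold:
            cluster = clusters[best]
            # Update the cluster summary: merged tag counts and running mean time.
            cluster["tags"].update(doc["tags"])
            cluster["time"] = (cluster["time"] * cluster["size"] + doc["time"]) / (
                cluster["size"] + 1
            )
            cluster["size"] += 1
        else:
            clusters.append(
                {"tags": Counter(doc["tags"]), "time": doc["time"], "size": 1}
            )
            best = len(clusters) - 1
        assignments.append(best)
    return assignments
```

Calling `online_cluster(docs, {"tags": 0.5, "time": 0.5})` returns one cluster index per document, in input order. Replacing the fixed weights with values fitted on labeled document pairs would be one way to approximate the learned metrics the abstract refers to.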