Combining crowd-generated media and personal data: semi-supervised learning for context recognition

  • Authors:
  • Long-Van Nguyen-Dinh, Mirco Rossi, Ulf Blanke, Gerhard Tröster

  • Affiliations:
  • ETH Zurich, Zurich, Switzerland (all authors)

  • Venue:
  • Proceedings of the 1st ACM International Workshop on Personal Data Meets Distributed Multimedia
  • Year:
  • 2013


Abstract

The growing ubiquity of sensors in mobile phones has opened many opportunities for sensing personal daily activities. Most context recognition systems require cumbersome preparation: training examples must be collected and manually annotated. Recently, mining online crowd-generated repositories for free annotated training data has been proposed as a way to build context models. A crowd-generated dataset can capture large variety, both in the number of classes and in intra-class diversity, but it may not cover all user-specific contexts. Thus, its performance is often significantly worse than that of user-centric training. In this work, we exploit for the first time the combination of a crowd-generated audio dataset available on the web and unlabeled audio data obtained from users' mobile phones. We use a semi-supervised Gaussian mixture model to combine labeled data from the crowd-generated database with unlabeled personal recordings, thereby refining generic knowledge with user data to train a personalized model. We tested this technique with 7 users on mobile phones, covering a total of 14 days of data and up to 9 context classes. Preliminary results show that the semi-supervised model can improve recognition accuracy by up to 21%.
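
The semi-supervised Gaussian mixture idea described in the abstract can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes one Gaussian component per context class and a standard EM procedure in which the labeled crowd data keep fixed one-hot responsibilities while the unlabeled personal recordings receive soft assignments. The paper's actual model configuration and audio features are not given in the abstract, so all function and variable names here are hypothetical.

```python
# Minimal semi-supervised GMM sketch (assumption: one Gaussian per class,
# and every class has at least a few labeled crowd examples).
import numpy as np
from scipy.stats import multivariate_normal

def semi_supervised_gmm(X_lab, y_lab, X_unlab, n_classes, n_iter=50, reg=1e-6):
    d = X_lab.shape[1]
    # Labeled crowd data: responsibilities are one-hot and stay fixed.
    R_lab = np.eye(n_classes)[y_lab]
    # Initialize class parameters from the labeled (crowd) data alone.
    means = np.array([X_lab[y_lab == k].mean(axis=0) for k in range(n_classes)])
    covs = np.array([np.cov(X_lab[y_lab == k].T) + reg * np.eye(d)
                     for k in range(n_classes)])
    priors = np.bincount(y_lab, minlength=n_classes) / len(y_lab)
    X_all = np.vstack([X_lab, X_unlab])
    for _ in range(n_iter):
        # E-step: soft class assignments for the unlabeled personal recordings.
        lik = np.column_stack([
            priors[k] * multivariate_normal.pdf(X_unlab, means[k], covs[k])
            for k in range(n_classes)])
        R_unlab = lik / lik.sum(axis=1, keepdims=True)
        R = np.vstack([R_lab, R_unlab])
        # M-step: re-estimate parameters from labeled + unlabeled data jointly.
        Nk = R.sum(axis=0)
        priors = Nk / len(X_all)
        means = (R.T @ X_all) / Nk[:, None]
        for k in range(n_classes):
            diff = X_all - means[k]
            covs[k] = (R[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    return means, covs, priors

def predict(X, means, covs, priors):
    # Assign each frame to the most likely context class.
    scores = np.column_stack([
        priors[k] * multivariate_normal.pdf(X, means[k], covs[k])
        for k in range(len(priors))])
    return scores.argmax(axis=1)
```

In each EM iteration the unlabeled personal data pull the class means and covariances toward the user's own acoustic environment, which captures the personalization effect the abstract describes: generic crowd-trained parameters are refined into a user-specific model without any extra manual annotation.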