Self-supervised mining of human activity from CGM

Authors:
Nguyen Minh The;Takahiro Kawamura;Hiroyuki Nakagawa;Yasuyuki Tahara;Akihiko Ohsuga
Affiliations:
Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan (all five authors)
Venue:
PKAW'10 Proceedings of the 11th international conference on Knowledge management and acquisition for smart systems and services
Year:
2010

Abstract

This paper describes a method for automatically extracting the basic attributes of an activity (actor, action, object, time, and location), together with the transitions between activities, from each sentence retrieved from Japanese consumer-generated media (CGM). Previous work has several limitations: high setup cost, inability to extract all attributes, restrictions on the types of sentences that can be handled, and insufficient consideration of the interdependency among attributes. To resolve these problems, the paper proposes a novel approach that treats activity extraction as a sequence labeling problem and automatically generates its own training data. The approach is domain-independent and scalable, and it requires no hand-tagged data. Because the positions and the number of attributes in an activity sentence do not need to be fixed in advance, the approach can extract all attributes and the transitions between activities in a single pass over the corpus.
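To make the sequence-labeling framing concrete, the following is a minimal sketch of how a labeled sentence could be turned into activity attributes. It is not the authors' implementation: the label names, the `group_attributes` helper, and the hand-labeled English sentence are all assumptions made for illustration (the paper works on Japanese text, and the labels would come from a trained sequence labeler rather than by hand). The sketch only shows why neither the position nor the number of attributes has to be fixed in advance.

```python
from typing import Dict, List, Tuple

# Hypothetical label set: one tag per attribute named in the abstract,
# plus "O" for tokens that belong to no attribute.
LABELS = {"ACTOR", "ACTION", "OBJECT", "TIME", "LOCATION", "O"}


def group_attributes(tagged: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect contiguous runs of identically tagged tokens into attribute spans."""
    spans: Dict[str, List[str]] = {}
    prev_label = "O"
    for token, label in tagged:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if label != "O":
            if label == prev_label:
                # Same label as the previous token: extend the current span.
                spans[label][-1] += " " + token
            else:
                # A different label starts a new span for that attribute.
                spans.setdefault(label, []).append(token)
        prev_label = label
    return spans


# Illustrative, hand-labeled sentence (an assumption, not data from the paper).
sentence = [
    ("Yesterday", "TIME"),
    ("I", "ACTOR"),
    ("bought", "ACTION"),
    ("a", "OBJECT"),
    ("new", "OBJECT"),
    ("camera", "OBJECT"),
    ("in", "O"),
    ("Akihabara", "LOCATION"),
]

activity = group_attributes(sentence)
print(activity)
```

Because each token carries its own label, attributes may appear in any order and any number of times, and one left-to-right pass over the tokens is enough to recover them. A sentence with two `ACTION` spans separated by other tokens yields two entries under `ACTION`, which is one simple way a transition between activities could surface in such a representation.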