Human activity mining using conditional random fields and self-supervised learning

Authors:
Nguyen Minh The;Takahiro Kawamura;Hiroyuki Nakagawa;Ken Nakayama;Yasuyuki Tahara;Akihiko Ohsuga
Affiliations:
Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan;Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan;Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan;Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan and Institute for Mathematics and Computer Science, Tsuda College, Tokyo, Japan;Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan;Graduate School of Information Systems, The University of Electro-Communications, Tokyo, Japan
Venue:
ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part I
Year:
2010

Abstract

In our definition, a human activity can be expressed by five basic attributes: actor, action, object, time, and location. The goal of this paper is to describe a method that automatically extracts all of these basic attributes, as well as the transitions between activities, from sentences in Japanese web pages. Previous work has several limitations: high setup costs, inability to extract all attributes, restrictions on the types of sentences that can be handled, and insufficient consideration of the interdependency among attributes. To resolve these problems, this paper proposes a novel approach that uses conditional random fields and self-supervised learning. The approach treats activity extraction as a sequence labeling problem; it is domain-independent and scalable, and it requires no human input. In an experiment, the approach achieves high precision (activity: 88.9%, attributes: over 90%, transition: 87.5%).
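
To illustrate what treating activity extraction as sequence labeling can look like, the following is a minimal sketch, not the authors' implementation. It assumes a BIO-style tag set over the five attributes (the abstract does not specify the label scheme), and it uses a hypothetical English gloss in place of a Japanese sentence. The function name `decode_activity` and the example sentence are illustrative only; in the paper's setting the labels would come from a trained conditional random field rather than being written by hand.

```python
from typing import Dict, List, Tuple

# The five basic attributes named in the abstract.
ATTRIBUTES = ("actor", "action", "object", "time", "location")


def decode_activity(tagged: List[Tuple[str, str]]) -> Dict[str, str]:
    """Collapse (token, label) pairs into one activity record.

    Labels are assumed to follow a BIO scheme such as B-ACTOR, I-OBJECT, O.
    Only the first span of each attribute is kept in this sketch.
    """
    spans: Dict[str, List[str]] = {}
    open_attr = None  # attribute whose span is currently being extended
    for token, label in tagged:
        prefix, _, name = label.partition("-")
        attr = name.lower()
        if prefix == "B" and attr in ATTRIBUTES and attr not in spans:
            spans[attr] = [token]       # start a new span
            open_attr = attr
        elif prefix == "I" and attr == open_attr:
            spans[attr].append(token)   # continue the open span
        else:
            open_attr = None            # "O" or an unexpected label closes the span
    return {attr: " ".join(tokens) for attr, tokens in spans.items()}


# Hypothetical labeled sentence (English gloss, hand-labeled for illustration).
example = [
    ("Yesterday", "B-TIME"),
    ("Taro", "B-ACTOR"),
    ("bought", "B-ACTION"),
    ("a", "B-OBJECT"),
    ("camera", "I-OBJECT"),
    ("in", "O"),
    ("Akihabara", "B-LOCATION"),
]
activity = decode_activity(example)
```

Here `activity` maps each attribute found in the sentence to its text span, for example `"object"` to `"a camera"`. Because all five attributes are labeled jointly over one token sequence, a sequence model can take the dependencies between neighboring labels into account, which is the interdependency among attributes that the abstract says earlier methods handled insufficiently.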