Identifying hidden contexts in classification

Authors:
Indre Žliobaite
Affiliations:
Eindhoven University of Technology, Eindhoven, The Netherlands and Smart Technology Research Center, Bournemouth University, Poole, UK
Venue:
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Year:
2011

Abstract

In this study we investigate how to identify hidden contexts from the data in classification tasks. Contexts are artifacts in the data that do not predict the class label directly. For instance, in a speech recognition task speakers might have different accents, which do not directly discriminate between the spoken words. We treat identifying hidden contexts as a data preprocessing task, which can help to build more accurate classifiers tailored to particular contexts and give insight into the data structure. We present three techniques to identify hidden contexts; each hides class label information from the input data and then partitions the data using clustering techniques. We form a collection of performance measures to ensure that the resulting contexts are valid. We evaluate the performance of the proposed techniques on thirty real datasets. We present a case study illustrating how the identified contexts can be used to build specialized, more accurate classifiers.
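
The idea of hiding class label information before clustering can be illustrated with a small sketch. The abstract does not specify the three techniques, so the code below makes an assumption: it hides the class signal by subtracting each class's mean vector from its instances, and then partitions the centered data with a plain k-means. The function names (`center_by_class`, `kmeans`), the farthest-point initialization, and the synthetic two-feature data are all hypothetical choices made for illustration; they are not taken from the paper.

```python
from statistics import mean


def center_by_class(X, y):
    """Subtract each class's mean vector (an assumed way to hide class information)."""
    means = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [mean(col) for col in zip(*rows)]
    return [[v - m for v, m in zip(x, means[lab])] for x, lab in zip(X, y)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(X, k, iters=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [list(X[0])]
    while len(centers) < k:
        farthest = max(X, key=lambda x: min(sq_dist(x, c) for c in centers))
        centers.append(list(farthest))
    for _ in range(iters):
        # Assign every point to its nearest center, then recompute the centers.
        assign = [min(range(k), key=lambda j: sq_dist(x, centers[j])) for x in X]
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:
                centers[j] = [mean(col) for col in zip(*members)]
    return assign


# Synthetic data (hypothetical): feature 0 separates the two classes,
# feature 1 separates the two hidden contexts (e.g. two accents).
X, y, contexts = [], [], []
for i in range(40):
    label, context = i % 2, (i // 2) % 2
    jitter0 = ((i * 7) % 5 - 2) * 0.1  # small deterministic noise
    jitter1 = ((i * 3) % 7 - 3) * 0.1
    X.append([10 * label + jitter0, 4 * context + jitter1])
    y.append(label)
    contexts.append(context)

raw_clusters = kmeans(X, 2)                    # dominated by the class signal
found = kmeans(center_by_class(X, y), 2)       # class signal hidden first


def agrees(a, b):
    """True if two binary partitions match up to swapping the cluster ids."""
    return a == b or a == [1 - v for v in b]


print("raw clusters match classes: ", agrees(raw_clusters, y))
print("found clusters match contexts:", agrees(found, contexts))
```

In this toy setting, clustering the raw inputs groups instances by class, because the class offset dominates the distances. Once the per-class means are removed, the remaining structure is the context offset. Real data would rarely separate this cleanly, which is why a collection of validity measures for the resulting contexts matters.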