Distributed feature extraction in a p2p setting: a case study

Authors:
Michael Wurst; Katharina Morik
Affiliations:
Artificial Intelligence Unit, Computer Science Department, University of Dortmund, Dortmund, Germany;Artificial Intelligence Unit, Computer Science Department, University of Dortmund, Dortmund, Germany
Venue:
Future Generation Computer Systems - Special section: Data mining in grid computing environments
Year:
2007

Abstract

Finding the right data representation is essential for virtually every data mining application. In this work, we describe an approach to collaborative feature extraction, selection and aggregation in distributed, loosely coupled domains. In contrast to other work in the field of distributed data mining, we focus on scenarios in which a large number of loosely coupled nodes apply data mining to different, usually very small and overlapping, subsets of the entire data space. The aim is not to find a single global concept that covers all data, but to learn a set of local concepts. Our prototypical application is Nemoz, a distributed media organization platform that assists users in maintaining their media collections. We propose two models for collaborative feature extraction, selection and aggregation for supervised data mining: one based on a centralized p2p architecture, the other on a fully distributed p2p architecture. We compare both models on a real-world data set and discuss their advantages and problems.