This article introduces a methodology for automatically organizing document collections into thematic categories for Personal Information Management (PIM) by collaboratively sharing machine learning models in an efficient and privacy-preserving way. Our objective is to combine multiple models, learned independently by several users, into an ensemble-based decision model that takes the knowledge of all these users into account in a decentralized manner, for example, in a peer-to-peer overlay network. The corresponding supervised (classification) and unsupervised (clustering) methods achieve high accuracy by restrictively leaving out uncertain documents rather than assigning them with low confidence to inappropriate topics or clusters. We introduce a formal probabilistic model for the resulting ensemble-based meta methods and explain how it can be used for constructing estimators and for goal-oriented tuning. Comprehensive evaluation results on different reference data sets illustrate the viability of our approach.
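The restrictive idea described above, abstaining on uncertain documents instead of forcing a low-confidence assignment, can be illustrated with a minimal Python sketch. It is not the article's actual formulation: the function `restrictive_meta_classify`, the helper `keyword_model`, the binary +1/-1 voting scheme, the weights, and the threshold are all hypothetical choices made for illustration. The sketch assumes each user contributes a binary model and that the ensemble accepts a document only when the weighted vote is decisive.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: each shared model maps a document to a vote of +1 or -1.
Model = Callable[[str], int]


def restrictive_meta_classify(
    doc: str,
    models: Sequence[Model],
    weights: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Combine the votes of several models and abstain when they disagree.

    Returns +1 or -1 if the weighted vote sum exceeds the threshold in
    absolute value, and None (document left out) otherwise.
    """
    if len(models) != len(weights):
        raise ValueError("models and weights must have the same length")
    score = sum(w * m(doc) for m, w in zip(models, weights))
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return None  # uncertain: leave the document unassigned


def keyword_model(keyword: str) -> Model:
    """Toy stand-in for one user's trained classifier (assumption)."""
    return lambda doc: 1 if keyword in doc.lower() else -1


# Three toy "peer" models for a hypothetical topic "finance".
models = [keyword_model("invoice"), keyword_model("payment"), keyword_model("budget")]
weights = [1.0, 1.0, 1.0]
threshold = 2.0  # with unit weights, only unanimous votes pass

accepted = restrictive_meta_classify(
    "Invoice and payment for the budget", models, weights, threshold
)
rejected = restrictive_meta_classify("Holiday photos", models, weights, threshold)
abstained = restrictive_meta_classify("Payment reminder", models, weights, threshold)
```

In this sketch, raising `threshold` leaves more documents unassigned and accepts only those on which the models agree strongly, while lowering it assigns more documents at the cost of less decisive votes. This is the kind of trade-off that goal-oriented tuning would have to balance.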