Many successful data-mining techniques and systems have been developed. These techniques usually apply to centralized databases with relaxed requirements on learning and response time. Comparatively little effort has gone into mining distributed databases or into real-time issues. In this paper, we investigate issues of fast distributed data mining. We assume either that merging the distributed databases into a single one would be too costly (the distributed case), or that the individual fragments are non-uniform, so that mining only one fragment would bias the result (the fragmented case). The goal is to classify the objects O of the database into one of several mutually exclusive classes C_i. Our approach to making mining fast and feasible is as follows. From each data site or fragment db_k, only a single rule r_ik is generated for each class C_i. A small subset {r_i1, ..., r_ih} of these individual rules is selected to form a rule set R_i for each class C_i. These rule subsets adequately represent the hidden knowledge of the entire database. Various selection criteria for forming R_i are discussed, both theoretically and experimentally.
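The overall flow described above (one rule per class per fragment, then a small selected subset per class) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the rule form (a single attribute = value test), the ranking by local precision and support, the choice of the top h rules as the selection criterion, the precision-weighted vote in `classify`, and all function names.

```python
from collections import Counter


def best_rule_for_class(fragment, target_class):
    """Generate one rule for target_class from a single fragment.

    fragment: list of (features_dict, label) pairs.
    Assumed rule form: one (attribute, value) test, chosen by highest
    local precision, with ties broken by support.
    """
    covered = Counter()  # records matching each (attribute, value) test
    correct = Counter()  # matching records that belong to target_class
    for features, label in fragment:
        for attr, value in features.items():
            covered[(attr, value)] += 1
            if label == target_class:
                correct[(attr, value)] += 1
    if not correct:
        return None  # this fragment holds no example of target_class
    test = max(correct, key=lambda t: (correct[t] / covered[t], correct[t]))
    return {
        "test": test,
        "cls": target_class,
        "precision": correct[test] / covered[test],
        "support": correct[test],
    }


def build_rule_sets(fragments, classes, h):
    """For each class, keep at most h of the per-fragment rules.

    Assumed selection criterion: the h rules with the best
    (precision, support) as measured on their own fragment.
    """
    rule_sets = {}
    for cls in classes:
        candidates = [best_rule_for_class(f, cls) for f in fragments]
        candidates = [r for r in candidates if r is not None]
        candidates.sort(key=lambda r: (r["precision"], r["support"]), reverse=True)
        rule_sets[cls] = candidates[:h]
    return rule_sets


def classify(rule_sets, features):
    """Assign the class whose matching rules carry the most precision weight."""
    votes = Counter()
    for cls, rules in rule_sets.items():
        for rule in rules:
            attr, value = rule["test"]
            if features.get(attr) == value:
                votes[cls] += rule["precision"]
    return votes.most_common(1)[0][0] if votes else None


# Two toy fragments standing in for separate data sites (made-up data).
fragments = [
    [({"color": "red", "size": "big"}, "A"),
     ({"color": "red", "size": "small"}, "A"),
     ({"color": "blue", "size": "big"}, "B")],
    [({"color": "blue", "size": "small"}, "B"),
     ({"color": "blue", "size": "big"}, "B"),
     ({"color": "red", "size": "big"}, "A")],
]
rule_sets = build_rule_sets(fragments, ["A", "B"], h=1)
print(classify(rule_sets, {"color": "red", "size": "small"}))
```

Only the small rule dictionaries leave a fragment in this sketch, never the raw records, which is the property that keeps the merge step cheap. Other selection criteria could be swapped in by changing the sort key in `build_rule_sets`.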