The development of high-throughput data acquisition technologies, together with advances in computing and communications, has resulted in explosive growth in the number, size, and diversity of potentially useful information sources. This growth has created unprecedented opportunities for data-driven knowledge acquisition and decision-making in a number of emerging, increasingly data-rich application domains such as bioinformatics, environmental informatics, enterprise informatics, and social informatics, among others. However, the massive size, semantic heterogeneity, autonomy, and distributed nature of the data repositories present significant hurdles in acquiring useful knowledge from the available data. This paper introduces some of the algorithmic and statistical problems that arise in such a setting and describes algorithms for learning classifiers from distributed data that offer rigorous performance guarantees relative to their centralized or batch counterparts. It also describes how this approach can be extended to work with autonomous, and hence inevitably semantically heterogeneous, data sources by making explicit the ontologies (attributes and relationships between attributes) associated with the data sources and reconciling the semantic differences among the data sources from a user’s point of view. This allows user- or context-dependent exploration of semantically heterogeneous data sources. The resulting algorithms have been implemented in INDUS, an open source software package for collaborative discovery from autonomous, semantically heterogeneous, distributed data sources.
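To make the idea of a guarantee "relative to the centralized counterpart" concrete, the following is a minimal sketch, not the INDUS implementation. It assumes a Naive Bayes classifier over horizontally partitioned data, where each source returns only count statistics rather than raw records. The function names, the toy records, and the choice of Naive Bayes are all illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter
from math import log

def local_counts(rows):
    """Run at one data source: return class counts and
    (attribute index, value, class) counts for its own rows."""
    class_counts = Counter()
    joint_counts = Counter()
    for *features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            joint_counts[(i, value, label)] += 1
    return class_counts, joint_counts

def merge_counts(stats):
    """Run at the learner: add up the counts returned by every source."""
    total_class, total_joint = Counter(), Counter()
    for class_counts, joint_counts in stats:
        total_class.update(class_counts)
        total_joint.update(joint_counts)
    return total_class, total_joint

def predict(class_counts, joint_counts, features, n_values):
    """Naive Bayes with Laplace smoothing.
    n_values[i] is the number of distinct values of attribute i."""
    n = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        count = class_counts[label]
        score = log(count / n)
        for i, value in enumerate(features):
            smoothed = (joint_counts[(i, value, label)] + 1) / (count + n_values[i])
            score += log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records (outlook, temperature, class) held at two sites.
site_a = [("sunny", "hot", "no"), ("rainy", "mild", "yes")]
site_b = [("sunny", "mild", "yes"), ("rainy", "hot", "no"), ("sunny", "hot", "no")]

# Counts gathered from the sites match counts computed on the pooled data,
# so the classifier built from them is identical to the centralized one.
distributed = merge_counts([local_counts(site_a), local_counts(site_b)])
centralized = local_counts(site_a + site_b)
assert distributed == centralized

prediction = predict(*distributed, ("sunny", "hot"), n_values=[2, 2])
```

Because counts are additive across sources, the merged statistics equal those computed on the pooled data, which is the sense in which such a learner is exact with respect to its batch counterpart. Handling semantically heterogeneous sources would additionally require mapping each source's attribute values into the user's ontology before counting, which this sketch omits.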