Machine learning is the science of building predictors from data while accounting for the predictors' accuracy on future data. Many machine learning classifiers make accurate predictions when the data are complete. When some values are missing, statistical methods can fill in a few missing items, but these methods rely only on the available data to estimate the missing values and perform poorly once the percentage of missing values exceeds a threshold. An alternative is to fill in the missing data through an automated knowledge discovery process that mines the WWW. This novel procedure has two steps: first the missing information is restored, and then the parameters of the classifier are learned from the restored data. With a Bayesian network as the classifier, the parameters, i.e., the probabilities associated with the causal relationships in the network, are estimated from the knowledge mined from the WWW in conjunction with the data on hand. When tested on heart disease data sets from the UC Irvine Machine Learning Repository [UCI repository of machine learning databases], the method gave satisfactory results.
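The restore-then-learn idea can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, the toy records, and the `external_guess` dictionary (a stand-in for knowledge mined from the WWW) are all hypothetical, and the parameter estimate uses simple counting with Laplace smoothing as an assumed technique.

```python
from collections import Counter, defaultdict

def fill_missing(records, external_guess):
    """Replace None values with an externally supplied value per attribute.

    `external_guess` is a hypothetical stand-in for web-mined knowledge.
    """
    return [
        {k: (external_guess[k] if v is None else v) for k, v in r.items()}
        for r in records
    ]

def estimate_cpt(records, child, parent, alpha=1.0):
    """Estimate P(child | parent) by counting, with Laplace smoothing alpha."""
    child_values = sorted({r[child] for r in records})
    counts = defaultdict(Counter)
    for r in records:
        counts[r[parent]][r[child]] += 1
    cpt = {}
    for parent_value, child_counts in counts.items():
        total = sum(child_counts.values()) + alpha * len(child_values)
        cpt[parent_value] = {
            v: (child_counts[v] + alpha) / total for v in child_values
        }
    return cpt

# Toy records (assumed, not from the paper); None marks a missing value.
records = [
    {"smoker": "yes", "disease": "yes"},
    {"smoker": "yes", "disease": None},
    {"smoker": "no", "disease": "no"},
    {"smoker": "no", "disease": "no"},
]
external_guess = {"smoker": "no", "disease": "yes"}

restored = fill_missing(records, external_guess)          # step 1: restore
cpt = estimate_cpt(restored, child="disease", parent="smoker")  # step 2: learn
print(cpt)
```

Here the single parent-child pair plays the role of one edge in a Bayesian network; a full network would repeat the estimate for each node given its parents.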