When Is the Right Time to Refresh Knowledge Discovered from Data?

Authors:
Xiao Fang; Olivia R. Liu Sheng; Paulo Goes
Affiliations:
Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, Utah 84112; Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, Utah 84112; Department of Management Information Systems, Eller College of Management, University of Arizona, Tucson, Arizona 85721
Venue:
Operations Research
Year:
2013

Abstract

Knowledge discovery in databases (KDD) techniques have been extensively employed to extract knowledge from massive data stores to support decision making in a wide range of critical applications. Maintaining the currency of discovered knowledge over evolving data sources is a fundamental challenge faced by all KDD applications. This paper addresses the challenge from the perspective of deciding the right times to refresh knowledge. We define the knowledge-refreshing problem and model it as a Markov decision process. Based on the identified properties of the Markov decision process model, we establish that the optimal knowledge-refreshing policy is monotonically increasing in the system state within every appropriate partition of the state space. We further show that the problem of searching for the optimal knowledge-refreshing policy can be reduced to the problem of finding the optimal thresholds and propose a method for computing the optimal knowledge-refreshing policy. The effectiveness and the robustness of the computed optimal knowledge-refreshing policy are examined through extensive empirical studies addressing a real-world knowledge-refreshing problem. Our method can be applied to refresh knowledge for KDD applications that employ major data-mining models.
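
To illustrate how a monotone policy reduces to a threshold, the following is a minimal sketch of a toy refresh-or-wait Markov decision process solved by value iteration. It is not the model from the paper: the scalar state (units of unprocessed new data), the linear staleness cost, the fixed refresh cost, the arrival probability, the discount factor, and the function name `solve_refresh_mdp` are all assumptions chosen only for illustration.

```python
def solve_refresh_mdp(n_states=21, refresh_cost=5.0, staleness_cost=1.0,
                      arrival_prob=0.6, discount=0.9, tol=1e-10):
    """Value iteration for a toy refresh-or-wait MDP (illustrative assumptions only).

    State s counts units of new data not yet reflected in the knowledge.
    Action 0 (wait): pay staleness_cost * s; with probability arrival_prob
    the state grows by one (capped at n_states - 1), otherwise it stays.
    Action 1 (refresh): pay refresh_cost and move to state 0.
    Returns the discounted cost-to-go per state and the greedy policy.
    """
    values = [0.0] * n_states
    while True:
        # Refreshing costs the same from every state.
        q_refresh = refresh_cost + discount * values[0]
        new_values, policy = [], []
        for s in range(n_states):
            nxt = min(s + 1, n_states - 1)
            q_wait = staleness_cost * s + discount * (
                arrival_prob * values[nxt] + (1 - arrival_prob) * values[s])
            policy.append(1 if q_refresh < q_wait else 0)
            new_values.append(min(q_refresh, q_wait))
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, policy


values, policy = solve_refresh_mdp()
# Because the policy is non-decreasing in s, one number describes it:
# refresh whenever the state is at or above this threshold.
threshold = policy.index(1)
print("policy:", policy)
print("refresh when state >=", threshold)
```

In this toy setting the cost of waiting grows with the state while the cost of refreshing does not, so the computed policy switches from waiting to refreshing exactly once. Searching over policies therefore amounts to searching over a single threshold, which mirrors, in a much simpler form, the reduction the abstract describes.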