We consider the problem of constructing decision trees for entity identification from a given table. The input is a table containing information about a set of entities over a fixed set of attributes. The goal is to construct a decision tree that identifies each entity unambiguously by testing the attribute values such that the average number of tests is minimized. The previously best known approximation ratio for this problem was O(log^2 N). In this paper, we present a new greedy heuristic that yields an improved approximation ratio of O(log N).
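To make the objective concrete, the following is a minimal sketch of the problem setting only. It is not the greedy heuristic of the paper, whose details are not given in this abstract. It assumes a generic rule (split on the attribute whose largest branch is smallest) and a uniform average over entities. The names `build_tree`, `average_tests`, and the toy table are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_tree(table, entities, attributes):
    """Generic greedy sketch (an assumption, not the paper's heuristic):
    split on the attribute whose largest branch is smallest."""
    if len(entities) == 1:
        return entities[0]  # leaf: the entity is identified
    best_attr, best_groups, best_worst = None, None, None
    for attr in attributes:
        groups = defaultdict(list)
        for entity in entities:
            groups[table[entity][attr]].append(entity)
        if len(groups) < 2:
            continue  # this attribute does not separate the remaining entities
        worst = max(len(group) for group in groups.values())
        if best_worst is None or worst < best_worst:
            best_attr, best_groups, best_worst = attr, groups, worst
    if best_attr is None:
        raise ValueError("some entities cannot be distinguished by any attribute")
    children = {
        value: build_tree(table, group, attributes)
        for value, group in best_groups.items()
    }
    return {"attr": best_attr, "children": children}


def total_tests(tree, depth=0):
    """Sum over all leaves of the number of tests on the root-to-leaf path."""
    if not isinstance(tree, dict):
        return depth
    return sum(total_tests(child, depth + 1) for child in tree["children"].values())


def average_tests(tree, num_entities):
    """Average number of tests needed to identify an entity (uniform weights)."""
    return total_tests(tree) / num_entities


# Toy table: four entities described by three attributes (illustrative data).
table = {
    "e1": {"A": 0, "B": 0, "C": 0},
    "e2": {"A": 0, "B": 0, "C": 1},
    "e3": {"A": 0, "B": 1, "C": 0},
    "e4": {"A": 1, "B": 1, "C": 1},
}
tree = build_tree(table, list(table), ["A", "B", "C"])
cost = average_tests(tree, len(table))
```

On this toy table the sketch tests attribute B at the root and identifies every entity with two tests, so `cost` is 2.0. Splitting on A first would isolate e4 after one test but leave three entities to separate, giving a higher average.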