Errors Detection and Correction in Large Scale Data Collecting
IDA '01 Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis
This paper addresses the automatic detection and correction of inconsistent or out-of-range data in a general process of statistical data collection. The proposed approach can handle hierarchical data containing both qualitative and quantitative values. As is customary, erroneous records are detected by formulating a set of rules. Erroneous records should then be corrected by modifying the erroneous data as little as possible, while causing minimum perturbation to the original frequency distributions of the data. This process is called imputation. By encoding the rules as linear inequalities, we convert imputation problems into integer linear programming problems. The proposed procedure is tested on a real-world census case. The results are extremely encouraging from both the computational and the data quality points of view.
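To make the idea of rules encoded as linear inequalities concrete, the following is a minimal sketch and not the paper's procedure. The record layout (age, married, years_married), the three rules, the field domains, and the function names are all hypothetical. For readability the tiny integer program is solved by exhaustive enumeration instead of an ILP solver, and the objective only counts changed fields (with total shift as a tie-break); it does not model the frequency-distribution criterion mentioned above.

```python
from itertools import product

# Hypothetical record layout: (age, married, years_married).
# Each field has a small integer domain so the search space can be enumerated.
DOMAINS = [range(0, 111), range(0, 2), range(0, 111)]

# Hypothetical edit rules, each written as a linear inequality a . x <= b.
RULES = [
    ((-1, 16, 0), 0),    # married = 1 implies age >= 16
    ((0, -110, 1), 0),   # married = 0 implies years_married = 0
    ((-1, 16, 1), 0),    # years_married <= age - 16 * married
]


def violated(record):
    """Return the indices of the rules that the record violates."""
    return [
        i
        for i, (coeffs, bound) in enumerate(RULES)
        if sum(c * x for c, x in zip(coeffs, record)) > bound
    ]


def impute(record):
    """Return the consistent record closest to the input.

    Closeness is lexicographic: fewest changed fields first, then the
    smallest total absolute shift (an assumed tie-break).
    """
    best_key, best = None, None
    for candidate in product(*DOMAINS):
        if violated(candidate):
            continue  # candidate breaks at least one rule
        changes = sum(c != r for c, r in zip(candidate, record))
        shift = sum(abs(c - r) for c, r in zip(candidate, record))
        key = (changes, shift)
        if best_key is None or key < best_key:
            best_key, best = key, candidate
    return best


if __name__ == "__main__":
    suspicious = (10, 1, 3)        # a married 10-year-old, married for 3 years
    print(violated(suspicious))    # rules 0 and 2 are violated
    print(impute(suspicious))      # only the age field is changed
```

In this toy case a single change to the age field restores consistency, whereas changing the marital status alone would leave a non-zero marriage duration for an unmarried person. At realistic scale the same feasibility constraints and change-count objective would be handed to an integer linear programming solver rather than enumerated.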