This paper surveys two classes of methods for assessing and improving the quality of individual files or groups of files. The first class consists of edit/imputation methods, which enforce business rules and impute missing data. The second class consists of data cleaning methods, which find duplicates within a file or across files.
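To make the two classes concrete, the following is a minimal sketch and not the methods surveyed in the paper. It assumes a hypothetical edit rule (nobody under 16 is recorded as married), uses median imputation as a deliberately simple stand-in for model-based imputation, and flags likely duplicates by character-level similarity of names with the standard library's `difflib.SequenceMatcher`. The field names, the rule, and the 0.85 threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def violates_edit(record):
    """Hypothetical edit rule: a record fails if age < 16 and marital == 'married'."""
    age = record.get("age")
    return age is not None and age < 16 and record.get("marital") == "married"


def impute_missing_age(records):
    """Fill missing ages with the upper median of the observed ages.

    Assumption: a simple stand-in for the model-based imputation a
    production system would use.
    """
    observed = sorted(r["age"] for r in records if r.get("age") is not None)
    median = observed[len(observed) // 2]
    return [dict(r, age=median) if r.get("age") is None else dict(r) for r in records]


def likely_duplicates(records, threshold=0.85):
    """Return index pairs whose normalized names are at least `threshold` similar.

    Assumption: compares every pair, which is quadratic in the number of
    records; large files would need blocking or indexing first.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a = records[i]["name"].lower().strip()
            b = records[j]["name"].lower().strip()
            if SequenceMatcher(None, a, b).ratio() >= threshold:
                pairs.append((i, j))
    return pairs


# Illustrative records (made up for this sketch).
records = [
    {"name": "John Smith", "age": 34, "marital": "married"},
    {"name": "Jon Smith", "age": None, "marital": "married"},
    {"name": "Mary Jones", "age": 12, "marital": "married"},
]

failed = [r["name"] for r in records if violates_edit(r)]
imputed = impute_missing_age(records)
duplicates = likely_duplicates(records)
```

In this sketch the edit check and the duplicate search are independent passes. A real pipeline would typically standardize fields first and restrict comparisons to candidate pairs rather than scanning all pairs.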