Leveraging spatio-temporal redundancy for RFID data cleansing

Authors:
Haiquan Chen; Wei-Shinn Ku; Haixun Wang; Min-Te Sun
Affiliations:
Auburn University, Auburn, AL, USA; Auburn University, Auburn, AL, USA; Microsoft Research Asia, Beijing, China; National Central University, Taoyuan, Taiwan, ROC
Venue:
Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data
Year:
2010

Abstract

Radio Frequency Identification (RFID) technologies are used in many applications for data collection. However, raw RFID readings are usually of low quality and may contain many anomalies. An ideal solution for RFID data cleansing should address the following issues. First, in many applications, duplicate readings of the same object (by multiple readers simultaneously or by a single reader over a period of time) are very common, and the solution should exploit the resulting redundancy for data cleaning. Second, prior knowledge about the readers and the environment (e.g., prior data distribution, false negative rates of readers) may help improve data quality and remove anomalies, and a desirable solution must be able to quantify the degree of uncertainty based on such knowledge. Third, the solution should take advantage of constraints given in target applications (e.g., the number of objects in the same location cannot exceed a given value) to improve the accuracy of data cleansing. A number of RFID data cleansing techniques exist, but none of them supports all of these features. In this paper, we propose a Bayesian-inference-based approach for cleaning raw RFID data. Our approach takes full advantage of data redundancy. To capture the likelihood, we design an n-state detection model and formally prove that the 3-state model can maximize the system performance. Moreover, to sample from the posterior, we devise a Metropolis-Hastings sampler with Constraints (MH-C), which incorporates constraint management to clean raw RFID data with high efficiency and accuracy. We validate our solution with a common RFID application and demonstrate the advantages of our approach through extensive simulations.
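
The abstract does not give the details of MH-C or of the n-state detection model, so the following is only a minimal sketch of the general idea: a Metropolis-Hastings sampler over object-to-location assignments that rejects any proposal violating a capacity constraint. Everything specific here is an assumption made for illustration: the read-rate table, the uniform prior, the one-dimensional reader layout, and the names `read_rate`, `log_likelihood`, `satisfies_capacity`, and `mh_sample`. It should not be read as the authors' algorithm.

```python
import math
import random
from collections import Counter

# Hypothetical read rates (NOT the paper's n-state model): probability that a
# reader reports a tag, keyed by the distance between tag location and reader.
READ_RATE = {0: 0.9, 1: 0.3}  # same location, adjacent location
FAR_RATE = 0.01               # any larger distance


def read_rate(loc, reader):
    return READ_RATE.get(abs(loc - reader), FAR_RATE)


def log_likelihood(loc, reads):
    """Log-probability of one object's binary readings given its location.

    reads[j] is 1 if reader j reported the tag and 0 otherwise.
    """
    total = 0.0
    for reader, seen in enumerate(reads):
        p = read_rate(loc, reader)
        total += math.log(p if seen else 1.0 - p)
    return total


def satisfies_capacity(assignment, capacity):
    """True if no location holds more than `capacity` objects."""
    return all(n <= capacity for n in Counter(assignment).values())


def mh_sample(readings, n_locations, capacity,
              n_iter=20000, burn_in=2000, seed=0):
    """Estimate each object's posterior location marginals (uniform prior)."""
    rng = random.Random(seed)
    n_objects = len(readings)
    # Round-robin start; feasible whenever n_objects <= n_locations * capacity.
    state = [i % n_locations for i in range(n_objects)]
    if not satisfies_capacity(state, capacity):
        raise ValueError("capacity too small for the number of objects")
    counts = [Counter() for _ in range(n_objects)]
    for step in range(n_iter):
        # Symmetric proposal: move one random object to a random location.
        obj = rng.randrange(n_objects)
        old_loc, new_loc = state[obj], rng.randrange(n_locations)
        state[obj] = new_loc
        accept = False
        if satisfies_capacity(state, capacity):
            # Only the moved object's likelihood term changes.
            log_ratio = (log_likelihood(new_loc, readings[obj])
                         - log_likelihood(old_loc, readings[obj]))
            accept = rng.random() < math.exp(min(0.0, log_ratio))
        if not accept:
            state[obj] = old_loc  # infeasible or rejected: keep the old state
        if step >= burn_in:
            for i, loc in enumerate(state):
                counts[i][loc] += 1
    kept = n_iter - burn_in
    return [{loc: c / kept for loc, c in cnt.items()} for cnt in counts]


# Two objects, three readers at locations 0, 1, 2, at most one object each.
readings = [[1, 1, 0], [0, 1, 0]]
marginals = mh_sample(readings, n_locations=3, capacity=1)
print(marginals)
```

In this toy input, object 0 is seen by readers 0 and 1, and object 1 only by reader 1. Enumerating the six feasible assignments by hand under the assumed read rates puts object 0 at location 0 with probability of about 0.92, compared with about 0.59 when the capacity constraint is ignored, and the sampler's estimate should land close to the constrained value. A single-object move like this one can mix poorly when the constraint is tight, so a practical sampler would likely need richer proposals such as swaps.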