An approximate duplicate elimination in RFID data streams

Authors:
Chun-Hee Lee;Chin-Wan Chung
Affiliations:
Data Analytics Group, SAIT, Samsung Electronics, Yongin, 446-712, Republic of Korea;Department of Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 305-701, Republic of Korea
Venue:
Data & Knowledge Engineering
Year:
2011

Abstract

RFID technology has been applied in a wide range of areas because it detects RFID tags without contact. However, because a single tag is often read multiple times and multiple readers are deployed, RFID data contains many duplicates. Since RFID data is generated as a stream, it is difficult to remove duplicates in one pass with limited memory. We propose one-pass approximate methods based on Bloom Filters that use a small amount of memory. We first devise Time Bloom Filters as a simple extension of Bloom Filters. We then propose Time Interval Bloom Filters to reduce errors. Because Time Interval Bloom Filters need more space than Time Bloom Filters, we also propose a method to reduce their space. Since both Time Bloom Filters and Time Interval Bloom Filters are based on Bloom Filters, they do not produce false negative errors. Experimental results show that our approaches effectively remove duplicates from RFID data streams in one pass with a small amount of memory.
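The abstract does not spell out how a Time Bloom Filter works, so the following is only a minimal sketch of one plausible reading: each cell of a Bloom-filter-style array holds the time of the last reading hashed to it instead of a single bit, and a reading is reported as a duplicate when all of its cells were written within a time window. The class name `TimeBloomFilter`, the parameters `num_cells`, `num_hashes`, and `window`, the salted SHA-256 hashing, and the choice to refresh timestamps on every reading are all assumptions made for illustration, not details taken from the paper.

```python
import hashlib


class TimeBloomFilter:
    """Hypothetical sketch: a Bloom-filter-like array of timestamps."""

    def __init__(self, num_cells, num_hashes, window):
        self.num_cells = num_cells    # size of the timestamp array
        self.num_hashes = num_hashes  # number of hash positions per tag
        self.window = window          # readings closer than this are duplicates
        self.cells = [None] * num_cells  # None means "never written"

    def _positions(self, tag_id):
        # Derive the cell indexes by salting one hash function (an assumption;
        # any family of independent hash functions would do).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{tag_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_cells

    def observe(self, tag_id, now):
        """Report whether the reading looks like a duplicate, then record it.

        Assumes `now` never decreases between calls.
        """
        positions = list(self._positions(tag_id))
        duplicate = all(
            self.cells[p] is not None and now - self.cells[p] < self.window
            for p in positions
        )
        # Assumption: every reading refreshes its cells, duplicate or not.
        for p in positions:
            self.cells[p] = now
        return duplicate


tbf = TimeBloomFilter(num_cells=1024, num_hashes=3, window=10)
first = tbf.observe("tagA", now=0)    # nothing recorded yet
repeat = tbf.observe("tagA", now=5)   # same tag, 5 time units later
late = tbf.observe("tagA", now=20)    # 15 time units after the last reading
```

In this sketch a repeated reading inside the window always finds all of its cells recently written, which matches the abstract's claim of no false negatives. A distinct tag can still be misreported as a duplicate when other tags have recently written all of its cells, which is the approximate part of the method.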