Capacity/storage tradeoff in high-dimensional identification systems

Authors:
Ertem Tuncel
Affiliations:
Department of Electrical Engineering, University of California, Riverside, CA
Venue:
IEEE Transactions on Information Theory
Year:
2009

Abstract

The asymptotic tradeoff between the number of distinguishable objects and the necessary storage space (or, equivalently, the search complexity) in an identification system is investigated. In the scenario considered, high-dimensional, noisy feature vectors extracted from objects are first compressed and then enrolled in the database. When the user submits a random query object, the noisy feature vector extracted from it is compared against the compressed entries, one of which is output as the identified object. The first result of the paper is a complete single-letter characterization of the achievable storage and identification rates (measured in bits per feature dimension), subject to a vanishing probability of identification error as the dimensionality of the feature vectors becomes very large. This characterization is then extended to a multistage system in which, depending on the number of entries, identification uses part or all of the bits recorded in the database. Finally, it is shown that a necessary and sufficient condition for a two-stage system to achieve the single-stage capacities at each stage is Markovity of the optimal test channels.
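
The enroll-then-identify pipeline described in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the coding scheme analyzed in the paper: features are taken to be uniform binary vectors, noise is modeled as independent bit flips, "compression" is crude truncation to the first `kept` coordinates, and identification is a minimum-Hamming-distance search. The function name `simulate` and all of its parameters are hypothetical.

```python
import random


def simulate(num_objects=32, dim=256, kept=128, flip_prob=0.05, seed=0):
    """Return the fraction of objects identified correctly in a toy system.

    Assumptions (not from the paper): binary features, bit-flip noise,
    and truncation to `kept` coordinates as a stand-in for compression.
    """
    rng = random.Random(seed)

    def noisy(bits):
        # Flip each bit independently with probability flip_prob.
        return [b ^ (rng.random() < flip_prob) for b in bits]

    # Underlying feature vectors, one per object.
    objects = [[rng.randint(0, 1) for _ in range(dim)]
               for _ in range(num_objects)]

    # Enrollment: observe a noisy feature vector, then store only `kept` bits.
    database = [noisy(x)[:kept] for x in objects]

    correct = 0
    for true_index, x in enumerate(objects):
        # Query: a fresh noisy observation of the same object.
        query = noisy(x)[:kept]
        # Identification: pick the stored entry closest in Hamming distance.
        guess = min(
            range(num_objects),
            key=lambda j: sum(q != d for q, d in zip(query, database[j])),
        )
        correct += guess == true_index
    return correct / num_objects


if __name__ == "__main__":
    for kept in (4, 16, 128):
        print(kept, simulate(kept=kept))
```

Storing fewer bits per entry shrinks the database but limits how many objects can be told apart: with `kept=4` there are only 16 distinct stored patterns, so at most half of the 32 objects can be identified correctly in this toy setup.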