Effective monitoring by efficient fingerprint matching using a forest of NAQ-trees

Authors:
Ming Zhang;Keivan Kianmehr;Reda Alhajj
Affiliations:
Department of Computer Science, University of Calgary, Calgary, Canada;Department of Electrical and Computer Engineering, The University of Western Ontario, London, Canada;Department of Computer Science, University of Calgary, Calgary, Canada and Department of Computer Science, Global University, Beirut, Lebanon
Venue:
Journal of Intelligent Information Systems
Year:
2011

Abstract

Sensor devices are widely used in applications such as security, wildlife monitoring, and critical health cases. The sensors continuously capture information about the monitored subject and encode it into feature vectors, called fingerprints, which are sent to a central server for further analysis; this process is generally semi-automated. To ease online analysis, the central server should maintain a reference database of standard fingerprints that represent the status of known conditions. The key operation is to find the matches (i.e., nearest neighbors) for each fingerprint arriving from the remote sensor devices, so that the current status of each device can be determined automatically. Because fingerprints usually have hundreds of dimensions and quick response is typically the top priority in sensor-based monitoring, existing index structures for nearest neighbor search do not serve such applications well. In this paper, we propose a method that allows fully automated monitoring by efficiently reporting the matches for most fingerprints sent by the sensor devices. The method consists of two steps. The first step clusters the reference database into r-separable clusters and selects one fingerprint (i.e., the centroid) to represent each cluster. The second step indexes the representative fingerprints using a set of NAQ-trees residing on multiple nodes of a parallel machine. In the query processing phase, the indexes are queried in parallel, and in each tree only a very small number of index nodes are searched to produce a partial result; the partial results are then combined into the final result. Owing to the "randomization" property and compact partitioning of the NAQ-tree construction, the union of the partial results is expected to cover most of the matches; experiments conducted to assess the applicability and effectiveness of the proposed approach confirm this.
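
The overall flow described in the abstract (cluster the reference database, keep one representative per cluster, index the representatives in several randomized structures, query them in parallel, and combine the partial results) can be illustrated with a minimal Python sketch. The abstract does not give the NAQ-tree construction or the definition of r-separable clusters, so both are replaced here by labeled stand-ins: a greedy leader clustering with radius `r`, and a hypothetical `PivotIndex` that partitions the representatives around randomly chosen pivots and searches a single bucket per query. All names, parameters, and data below are assumptions made for illustration, not the authors' implementation.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_representatives(fingerprints, r):
    """Stand-in for the clustering step (NOT the paper's algorithm):
    a fingerprint joins the first cluster whose leader lies within r,
    otherwise it starts a new cluster. Returns one centroid per cluster."""
    clusters = []  # each entry: (leader, members)
    for fp in fingerprints:
        for leader, members in clusters:
            if dist(fp, leader) <= r:
                members.append(fp)
                break
        else:
            clusters.append((fp, [fp]))
    # Centroid = component-wise mean of the cluster members.
    return [tuple(sum(col) / len(members) for col in zip(*members))
            for _, members in clusters]


class PivotIndex:
    """Hypothetical stand-in for one NAQ-tree: a one-level randomized
    partition of the representatives around randomly chosen pivots."""

    def __init__(self, reps, n_pivots, seed):
        rng = random.Random(seed)
        self.pivots = rng.sample(reps, min(n_pivots, len(reps)))
        self.buckets = [[] for _ in self.pivots]
        for p in reps:
            self.buckets[self._nearest_pivot(p)].append(p)

    def _nearest_pivot(self, q):
        return min(range(len(self.pivots)),
                   key=lambda i: dist(q, self.pivots[i]))

    def query(self, q):
        """Search only the bucket of the nearest pivot (a partial result)."""
        bucket = self.buckets[self._nearest_pivot(q)]
        return min(bucket, key=lambda p: dist(q, p))


def forest_query(forest, q):
    """Query every index concurrently and keep the best partial result."""
    with ThreadPoolExecutor(max_workers=len(forest)) as pool:
        partial = list(pool.map(lambda index: index.query(q), forest))
    return min(partial, key=lambda p: dist(q, p))


# Synthetic data, for illustration only.
rng = random.Random(0)
reference = [tuple(rng.random() for _ in range(16)) for _ in range(500)]
reps = cluster_representatives(reference, r=1.0)
forest = [PivotIndex(reps, n_pivots=8, seed=s) for s in range(4)]
query = tuple(rng.random() for _ in range(16))
match = forest_query(forest, query)
```

Each index inspects only one bucket, so a single index can miss the true nearest representative; because every index uses a different random seed, the partitions differ, and combining the partial results makes a miss less likely. This mirrors, in a much simplified form, the role the abstract attributes to randomization. The thread pool only illustrates the query structure; the paper places the trees on separate nodes of a parallel machine.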