Diamond: a storage architecture for early discard in interactive search
FAST'04 Proceedings of the 3rd USENIX conference on File and storage technologies
This paper explores the concept of early discard for interactive search of unindexed data. Processing data inside storage devices using downloaded searchlet code enables Diamond to perform efficient, application-specific filtering of large data collections. Early discard helps users who are looking for "needles in a haystack" by eliminating the bulk of the irrelevant items as early as possible. A searchlet consists of a set of application-generated filters that Diamond uses to determine whether an object may be of interest to the user. The system optimizes the evaluation order of the filters based on run-time measurements of each filter's selectivity and computational cost. Diamond can also dynamically partition computation between the storage devices and the host computer to adjust for changes in hardware and network conditions. Performance numbers show that Diamond dynamically adapts to a query and to run-time system state. An informal user study of an image retrieval application supports our belief that early discard significantly improves the quality of interactive searches.
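The idea of ordering filters by measured selectivity and cost can be illustrated with a minimal sketch. The abstract does not state which ordering policy Diamond uses, so the code below assumes a common heuristic for independent filters: run first the filter with the lowest ratio of mean cost to drop rate. All names here (FilterStats, order_filters, evaluate, expected_cost) are hypothetical and are not part of Diamond's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FilterStats:
    """A filter plus the run-time measurements gathered for it (hypothetical type)."""
    name: str
    predicate: Callable[[dict], bool]
    calls: int = 0
    passes: int = 0
    total_cost: float = 0.0  # accumulated cost, e.g. seconds

    def pass_rate(self) -> float:
        # Fraction of objects that pass; assume 1.0 until measured.
        return self.passes / self.calls if self.calls else 1.0

    def mean_cost(self) -> float:
        return self.total_cost / self.calls if self.calls else 1.0

    def rank(self) -> float:
        # Assumed heuristic: cost per unit of discarded data. Lower is better.
        drop_rate = 1.0 - self.pass_rate()
        return float("inf") if drop_rate == 0 else self.mean_cost() / drop_rate


def order_filters(filters: List[FilterStats]) -> List[FilterStats]:
    """Return the filters sorted so the best cost/drop ratio runs first."""
    return sorted(filters, key=lambda f: f.rank())


def expected_cost(ordered: List[FilterStats]) -> float:
    """Expected per-object cost of an ordering, assuming independent filters."""
    total, reach_prob = 0.0, 1.0
    for f in ordered:
        total += reach_prob * f.mean_cost()
        reach_prob *= f.pass_rate()
    return total


def evaluate(obj: dict, ordered: List[FilterStats]) -> bool:
    """Run filters in order, discard on the first failure, and update the stats."""
    for f in ordered:
        start = time.perf_counter()
        ok = f.predicate(obj)
        f.total_cost += time.perf_counter() - start
        f.calls += 1
        f.passes += int(ok)
        if not ok:
            return False  # early discard: later filters never run
    return True
```

Under this assumed policy, a filter that costs 5 units and passes 10% of objects runs before one that costs 1 unit and passes 90%, because the expected per-object cost is 5 + 0.1 × 1 = 5.1 rather than 1 + 0.9 × 5 = 5.5. Re-sorting periodically as the measurements change gives the kind of run-time adaptation the abstract describes.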