Similarity Search in High Dimensions via Hashing
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Query processing and optimization in Oracle Rdb
The VLDB Journal — The International Journal on Very Large Data Bases
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
TelegraphCQ: continuous dataflow processing
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Locality-sensitive hashing scheme based on p-stable distributions
SCG '04 Proceedings of the twentieth annual symposium on Computational geometry
Inverted Index Compression Using Word-Aligned Binary Codes
Information Retrieval
The OSU Flow-tools Package and CISCO NetFlow Logs
LISA '00 Proceedings of the 14th USENIX conference on System administration
More Netflow Tools for Performance and Security
LISA '04 Proceedings of the 18th USENIX conference on System administration
Compressing Bitmap Indices by Data Reorganization
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
BLINC: multilevel traffic classification in the dark
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
C-store: a column-oriented DBMS
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Super-Scalar RAM-CPU Cache Compression
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Optimizing bitmap indices with efficient compression
ACM Transactions on Database Systems (TODS)
Integrating compression and execution in column-oriented database systems
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Performance tradeoffs in read-optimized databases
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Tribeca: a system for managing large databases of network traffic
ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
Enabling Real-Time Querying of Live and Historical Stream Data
SSDBM '07 Proceedings of the 19th International Conference on Scientific and Statistical Database Management
On the performance of bitmap indices for high cardinality attributes
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
RLH: bitmap compression technique based on run-length and huffman encoding
Proceedings of the ACM tenth international workshop on Data warehousing and OLAP
Multi-probe LSH: efficient indexing for high-dimensional similarity search
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Bigtable: A Distributed Storage System for Structured Data
ACM Transactions on Computer Systems (TOCS)
Application of bitmap index to information retrieval
Proceedings of the 17th international conference on World Wide Web
Column-stores vs. row-stores: how different are they really?
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Breaking the memory wall in MonetDB
Communications of the ACM - Surviving the data deluge
Read-optimized databases, in depth
Proceedings of the VLDB Endowment
Histogram-aware sorting for enhanced word-aligned compression in bitmap indexes
Proceedings of the ACM 11th international workshop on Data warehousing and OLAP
Dynamic data organization for bitmap indices
Proceedings of the 3rd international conference on Scalable information systems
Scale-Up Strategies for Processing High-Rate Data Streams in System S
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Sorting improves word-aligned bitmap indexes
Data & Knowledge Engineering
Position list word aligned hybrid: optimizing space and performance for compressed bitmaps
Proceedings of the 13th International Conference on Extending Database Technology
Digging into HTTPS: flow-based classification of webmail traffic
IMC '10 Proceedings of the 10th ACM SIGCOMM conference on Internet measurement
Data structures: time, I/Os, entropy, joules!
ESA'10 Proceedings of the 18th annual European conference on Algorithms: Part II
NetStore: an efficient storage infrastructure for network forensics and monitoring
RAID'10 Proceedings of the 13th international conference on Recent advances in intrusion detection
Database compression on graphics processors
Proceedings of the VLDB Endowment
NET-FLi: on-the-fly compression, archiving and indexing of streaming network traffic
Proceedings of the VLDB Endowment
Reordering rows for better compression: Beyond the lexicographic order
ACM Transactions on Database Systems (TODS)
Toward efficient querying of compressed network payloads
USENIX ATC'12 Proceedings of the 2012 USENIX conference on Annual Technical Conference
Flow information storage assessment using IPFIXcol
AIMS'12 Proceedings of the 6th IFIP WG 6.6 international autonomous infrastructure, management, and security conference on Dependable Networks and Services
RasterZip: compressing network monitoring data with support for partial decompression
Proceedings of the 2012 ACM conference on Internet measurement conference
Indexing million of packets per second using GPUs
Proceedings of the 2013 conference on Internet measurement conference
Hi-index | 0.00 |
High-speed archival and indexing solutions of streaming traffic are growing in importance for applications such as monitoring, forensic analysis, and auditing. Many large institutions require fast solutions to support expedient analysis of historical network data, particularly in case of security breaches. However, "turning back the clock" is not a trivial task. The first major challenge is that such a technology needs to support data archiving under extremely high-speed insertion rates. Moreover, the archives created have to be stored in a compressed format that is still amenable to indexing and search. The above requirements make general-purpose databases unsuitable for this task and dedicated solutions are required. This work describes a solution for high-speed archival storage, indexing, and data querying on network flow information. We make the two following important contributions: (a) we propose a novel compressed bitmap index approach that significantly reduces both CPU load and disk consumption and, (b) we introduce an online stream reordering mechanism that further reduces space requirements and improves the time for data retrieval. The reordering methodology is based on the principles of locality-sensitive hashing (LSH) and also of interest for other bitmap creation techniques. Because of the synergy of these two components, our solution can sustain data insertion rates that reach 500,000---1 million records per second. To put these numbers into perspective, typical commercial network flow solutions can currently process 20,000---60,000 flows per second. In addition, our system offers interactive query response times that enable administrators to perform complex analysis tasks on the fly. Our technique is directly amenable to parallel execution, allowing its application in domains that are challenged by large volumes of historical measurement data, such as network auditing, traffic behavior analysis, and large-scale data visualization in service provider networks.