Matrix multiplication via arithmetic progressions
STOC '87 Proceedings of the nineteenth annual ACM symposium on Theory of computing
Finding and Counting Given Length Cycles (Extended Abstract)
ESA '94 Proceedings of the Second Annual European Symposium on Algorithms
Finding a minimum circuit in a graph
STOC '77 Proceedings of the ninth annual ACM symposium on Theory of computing
Fine-grain multi-thread processor architecture for massively parallel processing
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Mambo: a full system simulator for the PowerPC architecture
ACM SIGMETRICS Performance Evaluation Review - Special issue on tools for computer architecture research
Design, implementation, and evaluation of the linear road bnchmark on the stream processing core
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Fast and memory-efficient regular expression matching for deep packet inspection
Proceedings of the 2006 ACM/IEEE symposium on Architecture for networking and communications systems
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
SPC: a distributed, scalable platform for data mining
Proceedings of the 4th international workshop on Data mining standards, services and platforms
The end of an architectural era: (it's time for a complete rewrite)
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
SPADE: the system s declarative stream processing engine
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
H-store: a high-performance, distributed main memory transaction processing system
Proceedings of the VLDB Endowment
A code generation approach to optimizing high-performance distributed data stream processing
Proceedings of the 18th ACM conference on Information and knowledge management
Tools and strategies for debugging distributed stream processing applications
Software—Practice & Experience
Visualizing large-scale streaming applications
Information Visualization
Efficient algorithms for large-scale local triangle counting
ACM Transactions on Knowledge Discovery from Data (TKDD)
Introduction to the wire-speed processor and architecture
IBM Journal of Research and Development
Clustering coefficient queries on massive dynamic social networks
WAIM'10 Proceedings of the 11th international conference on Web-age information management
From a stream of relational queries to distributed stream processing
Proceedings of the VLDB Endowment
Finding, counting and listing all triangles in large graphs, an experimental study
WEA'05 Proceedings of the 4th international conference on Experimental and Efficient Algorithms
Hi-index | 0.00 |
The proliferation of mobile devices, coupled with continuous connectivity, has resulted in a world where massive amounts of data is being produced, on a daily basis, as a result of online interactions between people. These interactions are often captured as relationships in a social network graph, by service providers such as mobile carriers or social web applications. Social network analysis is becoming a common technique for extracting business intelligence from social network graphs in order to improve customer experience and provide better service. Some applications in this domain require processing massive data flows with high throughput and low-latency, in order to deliver timely results. SpamWatcher is a streaming social network analysis application that fits this description. It is used for real-time filtering of short messages in mobile communications, with the goal of preventing spam. The ever increasing volume of mobile users and rates of messages make realtime detection of spam a challenging problem with respect to performance and scalability. In this paper, we present a solution for the SpamWatcher application using the IBM wire-speed processor - a system-on-a-chip with specialized co-processors and integrated network I/O. This solution goes beyond the state-of-the-art by (i) using a novel implementation technique that takes advantage of the pattern matching accelerator to minimize the latency of spam detection, and (ii) employing hardware primitives to reduce the overhead caused by thread synchronization in order to achieve good scalability with respect to number of cores used. Furthermore, the solution is implemented on System S - a commercial-grade stream processing middleware. We evaluate our approach using real-world data sets and experimentally demonstrate the substantial performance improvements it achieves compared to previously published results.