In order to keep up with the demand for solutions to problems with ever-increasing data sets, both academia and industry have embraced commodity computer clusters with locally attached disks or SANs as an inexpensive alternative to supercomputers. With the advent of tools for parallel disk programming, such as MapReduce, STXXL, and Roomy, which allow the developer to focus on higher-level algorithms, programmer productivity for memory-intensive programs has increased many-fold. However, such parallel tools have primarily targeted iterative programs. We propose a programming model for migrating recursive RAM-based legacy algorithms to parallel disks. Many memory-intensive symbolic algebra algorithms are most easily expressed recursively. In that case, the programming challenge is compounded, since the developer must restructure the algorithm with two criteria in mind: converting a naturally recursive algorithm into an iterative one, while simultaneously exposing any potential data parallelism (as needed for parallel disks). Our model reduces the large effort that goes into the design phase of an external-memory algorithm. Research in this area over the past ten years has focused on per-problem solutions, without providing much insight into the connection between legacy algorithms and out-of-core algorithms. Our method shows how legacy algorithms employing recursion and non-streaming memory access can be translated more easily into efficient parallel disk-based algorithms. We demonstrate the ideas on the largest computation of its kind: the determinization, via subset construction, and minimization of very large nondeterministic finite-state automata (NFA). To our knowledge, this is the largest subset construction reported in the literature. Determinization of large NFA has long been a major computational hurdle in the study of permutation classes defined by token passing networks.
The programming model was used to design and implement an efficient NFA determinization algorithm that solves the next stage in analyzing token passing networks representing two stacks in series.
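As one illustration of the recursion-to-iteration restructuring described above, the following is a minimal in-RAM sketch of frontier-based subset construction. It is not the authors' implementation: the function names and data layout are our own, and epsilon-transitions are omitted for brevity. The point is the shape of the loop: the naturally recursive exploration of reachable state sets becomes an explicit work list, and each frontier level can in principle be processed in data-parallel fashion.

```python
# Hypothetical sketch of subset construction (NFA -> DFA) written
# iteratively with an explicit frontier instead of recursion.
# Epsilon-transitions are omitted; at disk scale, the 'seen' set would
# live on parallel disks with delayed duplicate detection.
from collections import deque

def determinize(nfa, start, alphabet):
    """nfa: dict mapping (state, symbol) -> set of successor states."""
    start_set = frozenset([start])
    dfa = {}                       # (state-set, symbol) -> state-set
    seen = {start_set}             # duplicate detection (in RAM here)
    frontier = deque([start_set])
    while frontier:                # iterative work list, no recursion
        current = frontier.popleft()
        for sym in alphabet:
            # Union of NFA successors over every state in the subset.
            nxt = frozenset(s for q in current
                              for s in nfa.get((q, sym), ()))
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return dfa, seen

# Tiny example NFA: on 'a', state 0 may stay or move to 1; 1 loops.
nfa = {(0, 'a'): {0, 1}, (1, 'a'): {1}}
dfa, dfa_states = determinize(nfa, 0, ['a'])
```

In a parallel disk-based setting, the membership test `nxt not in seen` is the expensive step; techniques such as delayed duplicate detection batch these lookups into streaming passes over sorted files rather than performing random accesses, which is the kind of transformation the proposed programming model is meant to systematize.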