Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The design of RPM: an FPGA-based multiprocessor emulator
FPGA '95 Proceedings of the 1995 ACM third international symposium on Field-programmable gate arrays
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
Multithreaded programming with Pthreads
Multithreaded programming with Pthreads
Software transactional memory for dynamic-sized data structures
Proceedings of the twenty-second annual symposium on Principles of distributed computing
Language support for lightweight transactions
OOPSLA '03 Proceedings of the 18th annual ACM SIGPLAN conference on Object-oriented programing, systems, languages, and applications
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Programming with transactional coherence and consistency (TCC)
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Unbounded Transactional Memory
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
BEE2: A High-End Reconfigurable Computing System
IEEE Design & Test
Virtualizing Transactional Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
AtomCaml: first-class atomicity via rollback
Proceedings of the tenth ACM SIGPLAN international conference on Functional programming
TAPE: a transactional application profiling environment
Proceedings of the 19th annual international conference on Supercomputing
Characterization of TCC on Chip-Multiprocessors
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
A chip prototyping substrate: the flexible architecture for simulation and testing (FAST)
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Design, implementation, and verification of active cache emulator (ACE)
Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays
McRT-STM: a high performance software transactional memory system for a multi-core runtime
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Multiple Instruction Stream Processor
Proceedings of the 33rd annual international symposium on Computer Architecture
Tradeoffs in transactional memory virtualization
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Adaptive software transactional memory
DISC'05 Proceedings of the 19th international conference on Distributed Computing
ATLAS: a chip-multiprocessor with transactional memory support
Proceedings of the conference on Design, automation and test in Europe
Parallelization of IBM mambo system simulator in functional modes
ACM SIGOPS Operating Systems Review
Proceedings of the 16th international ACM/SIGDA symposium on Field programmable gate arrays
Atom-Aid: Detecting and Surviving Atomicity Violations
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Hardwired Networks on Chip in FPGAs to Unify Functional and Con?guration Interconnects
NOCS '08 Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip
ProtoFlex: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
ACM SIGARCH Computer Architecture News
Performance evaluation of concurrently executing parallel applications on multi-processor systems
SAMOS'09 Proceedings of the 9th international conference on Systems, architectures, modeling and simulation
COREMU: a scalable and portable parallel full-system emulator
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
NetTM: faster and easier synchronization for soft multicores via transactional memory
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
From plasma to beefarm: design experience of an FPGA-based multicore prototype
ARC'11 Proceedings of the 7th international conference on Reconfigurable computing: architectures, tools and applications
Application-specific signatures for transactional memory in soft processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
An FPGA-based scalable simulation accelerator for tile architectures
ACM SIGARCH Computer Architecture News
Application-specific signatures for transactional memory in soft processors
ARC'10 Proceedings of the 6th international conference on Reconfigurable Computing: architectures, Tools and Applications
A multi-level design methodology of multistage interconnection network for MPSOCs
International Journal of Computer Applications in Technology
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
A customisable multiprocessor for application-optimised inductive logic programming
VoCS'08 Proceedings of the 2008 international conference on Visions of Computer Science: BCS International Academic Conference
Resource-bounded multicore emulation using Beefarm
Microprocessors & Microsystems
Area-efficient near-associative memories on FPGAs
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Hi-index | 0.00 |
Chip-multiprocessors are quickly gaining momentum in all segments of computing. However, the practical success of CMPs strongly depends on addressing the difficulty of multithreaded application development. To address this challenge, it is necessary to co-develop new CMP architecture with novel programming models. Currently, architecture research relies on software simulators which are too slow to facilitate interesting experiments with CMP software without using small datasets or significantly reducing the level of detail in the simulated models. An alternative to simulation is to exploit the rich capabilities of modern FPGAs to create FPGA-based platforms for novel CMP research. This paper presents ATLAS, the first prototype for CMPs with hardware support for Transactional Memory (TM), a technology aiming to simplify parallel programming. ATLAS uses the BEE2 multi-FPGA board to provide a system with 8 PowerPC cores that run at 100MHz and runs Linux. ATLAS provides significant benefits for CMP research such as 100x performance improvement over a software simulator and good visibility that helps with software tuning and architectural improvements. In addition to presenting and evaluating ATLAS, we share our observations about building a FPGA-based framework for CMP research. Specifically, we address issues such as overall performance, challenges of mapping ASIC-style CMP RTL on to FPGAs, software support, the selection criteria for the base processor, and the challenges of using pre-designed IP libraries.