RadixVM is a new virtual memory system design that enables fully concurrent operations on shared address spaces for multithreaded processes on cache-coherent multicore computers. Today, most operating systems serialize operations such as mmap and munmap, which forces application developers to work around the bottleneck, for example by splitting multithreaded applications into multiprocess applications or by hoarding memory to avoid the overhead of returning it. RadixVM removes this burden from application developers by ensuring that address space operations on non-overlapping memory regions scale perfectly. It does so by combining three techniques: 1) it organizes metadata in a radix tree instead of a balanced tree to avoid unnecessary cache line movement; 2) it uses a novel memory-efficient distributed reference counting scheme; and 3) it uses a new scheme to target remote TLB shootdowns and to often avoid them altogether. Experiments on an 80-core machine show that RadixVM achieves perfect scalability for non-overlapping regions: if several threads mmap or munmap pages in parallel, they can run completely independently and induce no cache coherence traffic.
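The first technique can be illustrated with a minimal sketch. This is not RadixVM's implementation: the names `Node`, `map_range`, `unmap_range`, and `lookup`, the 9-bit fanout, and the 4-level depth are all assumptions chosen for illustration. The sketch shows only the structural property the abstract relies on: in a fixed-depth radix tree, the slot that holds a page's metadata is determined by the page number alone, so operations on non-overlapping regions write to different slots, and page-aligned regions that fall in different 512-page blocks write to different leaf nodes. A balanced tree, by contrast, may rebalance and touch nodes shared by unrelated regions.

```python
BITS = 9                  # index bits per level (assumed; mirrors x86-64 page tables)
LEVELS = 4                # tree depth, so keys are 36-bit virtual page numbers
FANOUT = 1 << BITS


class Node:
    """One radix-tree node: a fixed array of child or metadata slots."""
    def __init__(self):
        self.slots = [None] * FANOUT


def _index(vpn, level):
    """Slot index used at `level` (0 = root) for virtual page number `vpn`."""
    shift = BITS * (LEVELS - 1 - level)
    return (vpn >> shift) & (FANOUT - 1)


def _leaf_for(root, vpn, create):
    """Walk from the root to the leaf covering `vpn`, optionally creating nodes."""
    node = root
    for level in range(LEVELS - 1):
        i = _index(vpn, level)
        child = node.slots[i]
        if child is None:
            if not create:
                return None
            child = node.slots[i] = Node()
        node = child
    return node


def map_range(root, start_vpn, npages, meta):
    """Record `meta` for each page; return the ids of the leaf nodes written."""
    touched = set()
    for vpn in range(start_vpn, start_vpn + npages):
        leaf = _leaf_for(root, vpn, create=True)
        leaf.slots[_index(vpn, LEVELS - 1)] = meta
        touched.add(id(leaf))
    return touched


def unmap_range(root, start_vpn, npages):
    """Clear metadata for each page; return the ids of the leaf nodes written."""
    touched = set()
    for vpn in range(start_vpn, start_vpn + npages):
        leaf = _leaf_for(root, vpn, create=False)
        if leaf is not None:
            leaf.slots[_index(vpn, LEVELS - 1)] = None
            touched.add(id(leaf))
    return touched


def lookup(root, vpn):
    """Return the metadata stored for `vpn`, or None if the page is unmapped."""
    leaf = _leaf_for(root, vpn, create=False)
    return None if leaf is None else leaf.slots[_index(vpn, LEVELS - 1)]


root = Node()
leaves_a = map_range(root, 0, FANOUT, "region A")            # pages 0..511
leaves_b = map_range(root, 4 * FANOUT, FANOUT, "region B")   # pages 2048..2559
print(leaves_a.isdisjoint(leaves_b))  # the two regions wrote different leaves
```

The sketch is single-threaded and omits everything that makes the real design concurrent: per-slot locking, the distributed reference counts, and TLB shootdown tracking. Interior nodes are also written when first created, so the disjointness shown here applies to leaf writes once the path to each leaf exists.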