Many streaming applications feature coarse-grained task-farm or pipeline parallelism and can be modeled as a set of parallel threads. Their performance requirements can often be met only by mapping the application onto a Multi-Processor System-on-Chip (MPSoC). To avoid contention, such systems employ hierarchical interconnection networks in which the central interconnect is a network-on-chip. In a clustered MPSoC of this kind, memory access latency varies strongly with the location of the data, and this variation is the principal cause of out-of-order arrival of data items. We present an algorithm that re-establishes the order of data items on the output side. A data item whose earliness or lateness exceeds a limit previously fixed by experimentation is dropped; otherwise it is stored in a buffer. Write operations to this buffer are random access, whereas read operations are in FIFO order. Our algorithm guarantees that no data item is removed from the buffer before it has been read and that, for a given throughput, the buffer size is minimal. The algorithm was implemented within the output co-processors of three application case studies and validated on a simulation platform based on the SoCLib library.
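The buffer behavior described above (random-access writes, FIFO reads, and dropping of items that fall outside a fixed limit) can be illustrated with a minimal sketch. This is not the authors' algorithm and makes no claim about buffer-size optimality. It assumes each data item carries a consecutive integer sequence number, and that the earliness and lateness limits are expressed as a single window of slots. The names `ReorderBuffer`, `window`, `write`, `read`, and `skip` are hypothetical and chosen only for illustration.

```python
# Illustrative resequencing buffer; all names are hypothetical assumptions.
_EMPTY = object()  # sentinel marking a free slot


class ReorderBuffer:
    def __init__(self, window):
        self.window = window              # number of slots (assumed limit)
        self.slots = [_EMPTY] * window    # circular storage
        self.next_seq = 0                 # sequence number the reader expects
        self.dropped = 0                  # count of items rejected on write

    def write(self, seq, item):
        """Random-access write: store the item in the slot for its sequence number."""
        offset = seq - self.next_seq
        # Too late (already passed) or too early (beyond the window): drop.
        if offset < 0 or offset >= self.window:
            self.dropped += 1
            return False
        # Accepted sequence numbers lie in [next_seq, next_seq + window),
        # so they map to distinct slots and cannot overwrite unread data.
        self.slots[seq % self.window] = item
        return True

    def read(self):
        """FIFO read: return the next item in order, or None if it has not arrived."""
        idx = self.next_seq % self.window
        item = self.slots[idx]
        if item is _EMPTY:
            return None
        self.slots[idx] = _EMPTY
        self.next_seq += 1
        return item

    def skip(self):
        """Give up on a missing item so that later items can be read."""
        self.slots[self.next_seq % self.window] = _EMPTY
        self.next_seq += 1
```

In this sketch an item is released only by `read` (or by an explicit `skip` of a slot whose item never arrived), which mirrors the property that no data leaves the buffer before it has been read. When to call `skip` (for example, after a timeout) is left to the caller and is not specified by the source.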