RuRot: Run-Time Rotatable-Expandable Partitions for Efficient Mapping in CGRAs

A4 Conference proceedings

Publication Details

List of Authors: Jafri SMAH, Serrano G, Iqbal J, Daneshtalab M, Hemani A, Paul K, Plosila J, Tenhunen H
Editors: Veidenbaum AV
Publisher: IEEE
Place: Agios Konstantinos
Publication year: 2014
Book title: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulations (SAMOS)
Start page: 233
End page: 241
ISBN: 978-1-4799-3770-7


Today, Coarse-Grained Reconfigurable Architectures (CGRAs) host multiple applications with arbitrary communication and computation patterns. Compile-time mapping decisions are neither optimal nor desirable for efficiently supporting diverse and unpredictable application requirements. As a solution to this problem, recently proposed architectures offer run-time remapping. Run-time remappers displace or expand (parallelize/serialize) an application to optimize different parameters (such as platform utilization). However, existing remappers support application displacement or expansion in either the horizontal or the vertical direction. Moreover, most works address dynamic remapping only in packet-switched networks and are therefore not applicable to CGRAs that exploit circuit switching for low power and high predictability. To enhance the optimality of run-time remappers, this paper presents a design framework called Run-time Rotatable-expandable Partitions (RuRot). RuRot provides architectural support to dynamically remap or expand (i.e., parallelize) the hosted applications in CGRAs with circuit-switched interconnects. Compared to the state of the art, the proposed design supports application rotation (in clockwise and anticlockwise directions) and displacement (in horizontal and vertical directions) at run-time. Simulation results using a few applications reveal that the additional flexibility significantly enhances device utilization (on average 50% for the tested applications). Synthesis results confirm that the proposed remapper has negligible silicon (0.2% of the platform) and timing (2 cycles per application) overheads.
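
To illustrate why rotation adds placement freedom beyond displacement, the following is a minimal geometric sketch in Python. It is not the hardware mechanism described in the paper: the grid model, the cell-set representation, and the names `displace`, `rotate_cw`, `rotate_ccw`, and `place` are all assumptions made for illustration only. It shows a three-cell application laid out horizontally that cannot be placed in a single free column by displacement alone, but fits once rotated by 90 degrees.

```python
from typing import FrozenSet, Optional, Tuple

Cell = Tuple[int, int]        # hypothetical (row, col) index of one CGRA cell
Partition = FrozenSet[Cell]   # cells occupied by one application


def displace(cells: Partition, d_row: int, d_col: int) -> Partition:
    """Shift a partition vertically by d_row and horizontally by d_col."""
    return frozenset((r + d_row, c + d_col) for r, c in cells)


def normalize(cells: Partition) -> Partition:
    """Move the partition so its top-left bounding corner is at (0, 0)."""
    min_r = min(r for r, _ in cells)
    min_c = min(c for _, c in cells)
    return displace(cells, -min_r, -min_c)


def rotate_cw(cells: Partition) -> Partition:
    """Rotate 90 degrees clockwise (rows grow downward, columns rightward)."""
    return normalize(frozenset((c, -r) for r, c in cells))


def rotate_ccw(cells: Partition) -> Partition:
    """Rotate 90 degrees anticlockwise."""
    return normalize(frozenset((-c, r) for r, c in cells))


def place(cells: Partition, free: Partition, rows: int, cols: int,
          allow_rotation: bool) -> Optional[Partition]:
    """Return the first placement that uses only free cells, or None."""
    orientations = [normalize(cells)]
    if allow_rotation:
        orientations += [rotate_cw(cells), rotate_ccw(cells)]
    for shape in orientations:
        for d_row in range(rows):
            for d_col in range(cols):
                candidate = displace(shape, d_row, d_col)
                if candidate <= free:  # every cell of the candidate is free
                    return candidate
    return None


# Assumed scenario: a 3x3 grid in which only the rightmost column is free.
free_cells = frozenset({(0, 2), (1, 2), (2, 2)})
app = frozenset({(0, 0), (0, 1), (0, 2)})  # application laid out in one row

print(place(app, free_cells, 3, 3, allow_rotation=False))
print(place(app, free_cells, 3, 3, allow_rotation=True))
```

In this toy scenario the displacement-only search returns `None`, while the search that also tries rotated orientations finds a placement in the free column. The exhaustive search is chosen for clarity, not efficiency, and says nothing about how the proposed remapper reaches its reported two-cycle overhead.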

Last updated on 2020-04-04 at 04:59