
PMSS: A programmable memory system and scheduler for complex memory patterns

Authors: Tassadaq Hussain, Amna Haider and Eduard Ayguade
Publication date: 2014
Journal: Journal of Parallel and Distributed Computing

Description:

Abstract: The HPC industry demands more computing units on FPGAs to enhance
performance through task and data parallelism. FPGAs can deliver their best performance on
certain kernels by customizing the hardware for the application. However, applications are
becoming more complex, with multiple kernels and complex data arrangements, which generates
overhead when scheduling and managing system resources. For this reason, all classes of
multi-threaded machines, from minicomputers to supercomputers, require efficient


Programmable Memory Controller for Vector System-on-Chip

Oct 30, 2014

Author: Tassadaq Hussain

Publication date: 2012

Source: The seventh Microsoft Research Summer School, Microsoft Research in Cambridge, U.K

 

PPMC : A Programmable Pattern based Memory Controller.

Jan 4, 2013

Authors: Hussain Tassadaq, Muhammad Shafiq, Miquel Pericas, Nacho Navarro, Eduard Ayguade.
ARC 2012, the 8th International Symposium on Applied Reconfigurable Computing (2012).

One of the main challenges in the design of hardware accelerators is the efficient access of data from the external memory. Improving and optimizing the functionality of the memory controller between the external memory and the accelerators is therefore critical. In this paper, we advance toward this goal by proposing PPMC, the Programmable Pattern-based Memory Controller. This controller supports scatter-gather and strided 1D, 2D and 3D accesses with programmable tiling. Compared to existing solutions, the proposed system provides better performance, simplifies programming access patterns and eases software integration by interfacing to high-level programming languages. In addition, the controller offers an interface for automating domain decomposition via tiling. We implemented and tested PPMC on a Xilinx ML505 evaluation board using a MicroBlaze soft-core as the host processor. The evaluation uses six memory intensive application kernels: Laplacian solver, FIR, FFT, Thresholding, Matrix Multiplication, and 3D-Stencil. The results show that the PPMC-enhanced system achieves at least 10x speed-ups for 1D, 2D and 3D memory accesses as compared to a non-PPMC based setup.
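The strided and tiled access patterns named in the abstract can be illustrated in software. The following is a minimal sketch, not PPMC's actual interface: the function names `strided_addresses` and `tile_addresses`, their parameters, and the row-major layout are all assumptions made for illustration. It shows how a pattern described by a base address, per-dimension element counts and per-dimension strides expands into a stream of addresses.

```python
from itertools import product


def strided_addresses(base, counts, strides):
    """Yield element addresses for an N-dimensional strided pattern.

    counts[i]  : number of elements to visit along dimension i
    strides[i] : address step (in elements) along dimension i
    (Hypothetical descriptor format, chosen for illustration.)
    """
    for index in product(*(range(c) for c in counts)):
        yield base + sum(i * s for i, s in zip(index, strides))


def tile_addresses(row_len, tile_rows, tile_cols, tile_r, tile_c):
    """Addresses of one 2D tile of a row-major matrix with `row_len` columns."""
    base = tile_r * tile_rows * row_len + tile_c * tile_cols
    return list(strided_addresses(base, (tile_rows, tile_cols), (row_len, 1)))


# 1D: every third element, four elements in total.
one_d = list(strided_addresses(0, (4,), (3,)))
# 2D: tile (1, 1) of an 8-column matrix split into 2x2 tiles.
two_d = tile_addresses(8, 2, 2, 1, 1)
# 3D: a 2x2x2 block of a volume with plane stride 16 and row stride 4.
three_d = list(strided_addresses(0, (2, 2, 2), (16, 4, 1)))
```

Describing a pattern by counts and strides lets one routine cover the 1D, 2D and 3D cases, and a tile is simply a 2D pattern with a shifted base address.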


PPMC : Hardware Scheduling and Memory Management support for Multi Hardware Accelerators.

Jan 4, 2013

Authors: Hussain Tassadaq, Miquel Pericas, Nacho Navarro, Eduard Ayguade.
FPL2012 | 22nd International Conference on Field Programmable Logic and Applications.

A generic multi-accelerator system comprises a microprocessor unit that schedules the accelerators along with the necessary data movements. With the processor as the control unit, the system encounters multiple delays (memory and task management) that degrade overall performance. This degradation calls for an efficient memory manager and a high-speed scheduler that feeds prearranged data to the appropriate accelerator. In this work we propose the integration of an efficient scheduler and an intelligent memory manager into an existing core known as PPMC (Programmable Pattern based Memory Controller), such that data movement and computational tasks can be handled proficiently. Consequently, the modified PPMC system improves performance by managing data movements and address generation in hardware and by scheduling accelerators without the intervention of a control processor or an operating system. The PPMC system is evaluated with six memory-intensive accelerators: Laplacian solver, FIR, FFT, Thresholding, Matrix Multiplication and 3D-Stencil. This modified PPMC system is implemented and tested on a Xilinx ML505 evaluation FPGA board. The performance of the system is compared with a microprocessor-based system that has been integrated with the Xilkernel operating system. Results show that the modified PPMC-based multi-accelerator system consumes 50% less hardware resources and 32% less on-chip power, and achieves approximately a 27x speed-up compared to the MicroBlaze-based system.


Implementation of a Reverse Time Migration Kernel using the HCE High Level Synthesis Tool.

Jan 4, 2013

Authors: Tassadaq Hussain, Miquel Pericas, Nacho Navarro, Eduard Ayguade.
The 2011 International Conference on Field-Programmable Technology FPT 2011 IIT Delhi New Delhi, India (2011)

Abstract: Reconfigurable computers have started to appear in the HPC landscape, albeit at a slow pace. Adoption is still being hindered by the design methodologies and slow implementation cycles. Recently, methodologies based on High Level Synthesis (HLS) have begun to flourish and the reconfigurable supercomputing community is slowly adopting these techniques. In this paper we took a geophysics application and implemented it on an FPGA using an HLS tool called HCE. The application, Reverse Time Migration, is an important code for subsalt imaging. It is also highly demanding, both computationally and in its memory requirements. The complexity of this code makes it challenging to implement using an HLS methodology instead of HDL. We study the achieved performance and compare it with hand-written HDL and also with software-based execution.
The resulting design, when implemented on the Altera Stratix IV EP4SGX230 and EP4SGX530 devices, achieves 11.2 and 22 GFLOPS, respectively. On these devices, the design achieved up to 4.2x and 7.9x improvement, respectively, over a general-purpose processor core (Intel i7).


Reconfigurable Memory Controller with Programmable Pattern Support.

Jan 4, 2013

Authors: Hussain Tassadaq, Miquel Pericas, Nacho Navarro, Eduard Ayguade.
Published: 5th HiPEAC Workshop on Reconfigurable Computing, WRC 2011.


Heterogeneous architectures are increasingly popular due to their flexibility and high performance per watt. One such architecture, the reconfigurable system-on-chip, offers high performance per watt through its reconfigurable logic and flexibility through its multiprocessor cores. To achieve the performance goals, however, it is necessary to provide enough data to the accelerators.
In this paper we describe a programmable, pattern-based memory controller (PMC) that aims to improve the performance of heterogeneous or reconfigurable SoC devices. The supported patterns include scatter-gather and strided 1D, 2D and 3D patterns. PMC can prefetch complete patterns into scratchpads that can then be accessed either by a microprocessor or by an accelerator. As a result, the microprocessors and accelerators can focus on computation and are relieved of having to perform address calculations. PMC has been implemented and tested on an ML505 evaluation board using the MicroBlaze soft-core as the platform's microprocessor.
While PMC adds some latency, it improves performance by offloading the processor and by making better use of the available bandwidth. When executing the thresholding application, PMC provides a 1.5x speed-up with the processor and a 27x speed-up with a hardware accelerator in the PMC-based SoC environment.
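The division of labour described above, in which the controller gathers a pattern into a scratchpad and the compute unit then works only on contiguous data, can be mimicked in software. This is a minimal sketch under stated assumptions: `prefetch_pattern` and `threshold` are hypothetical names, the "memory" is a plain Python list, and the sketch does not model latency or bandwidth.

```python
def prefetch_pattern(memory, addresses):
    """Copy the elements named by `addresses` into a contiguous scratchpad.

    Stands in for the controller's role: all address handling happens here.
    (Hypothetical helper, for illustration only.)
    """
    return [memory[a] for a in addresses]


def threshold(scratchpad, level):
    """Compute kernel: sees only contiguous data, performs no address math."""
    return [1 if value >= level else 0 for value in scratchpad]


# A 4x4 row-major matrix stored as 16 values: 0, 10, ..., 150.
memory = list(range(0, 160, 10))

# Strided pattern: column 1 of the matrix (start 1, stride 4).
column = prefetch_pattern(memory, range(1, 16, 4))

# Scatter-gather pattern: an arbitrary list of addresses.
scattered = prefetch_pattern(memory, [7, 2, 11])

# The kernel runs on the scratchpad contents only.
mask = threshold(column, 60)
```

Because the kernel receives a dense buffer, the same `threshold` routine works unchanged for strided and scatter-gather inputs; only the address list passed to the prefetch step differs.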
