Authors:
Abstract: Large-scale scientific simulations can be ported to heterogeneous environments with GPUs using domain decomposition. However, Fast Fourier Transform (FFT) based simulations require all-to-all communication and large memory, which is beyond the capacity of on-chip GPU memory. To overcome this, domain decomposition solutions are combined with adaptive sampling or pruning around the domain to reduce storage. Expression of such operations is a challenge in existing FFT libraries like FFTW, and thus it is difficult to get a high performance implementation of such methods. We demonstrate algorithm specification for one such simulation (Hooke's law) using FFTX, an emerging API with a SPIRAL-based code generation back-end, and suggest future extensions useful for GPU-based scientific computing.
Best Poster Finalist (BP): no
Poster: PDF
Poster summary: PDF
Back to Poster Archive Listing