An MPI-CUDA Implementation for Massively Parallel Incompressible

File size: 1161k

Description: A parallel implementation that combines MPI and CUDA.

Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.
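The headline figures in the abstract can be unpacked with some quick arithmetic. The following is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative and do not come from the source package.

```python
# Figures quoted in the abstract (the sustained rate is approximate).
nodes = 64
gpus = 128
processing_elements = 30_720
sustained_tflops = 2.4

# Derived per-node and per-GPU quantities.
gpus_per_node = gpus // nodes
elements_per_gpu = processing_elements // gpus
gflops_per_gpu = sustained_tflops * 1000 / gpus  # 1 TeraFLOPS = 1000 GigaFLOPS

print(f"GPUs per node: {gpus_per_node}")
print(f"Processing elements per GPU: {elements_per_gpu}")
print(f"Sustained GFLOPS per GPU: {gflops_per_gpu:.2f}")
```

This works out to 2 GPUs per node, 240 processing elements per GPU, and roughly 18.75 sustained GigaFLOPS per GPU.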
