Implementing a GPU-portable field line tracing application with OpenMP Offload

Jiménez, Diego; Herrera-Mora, Javier; Rampp, Markus; Laure, Erwin; Meneses, Esteban

doi:10.1007/978-3-031-23821-5_3


Released

Conference Paper


MPS-Authors

Rampp, Markus
Max Planck Computing and Data Facility, Max Planck Society

Laure, Erwin
Max Planck Computing and Data Facility, Max Planck Society


Citation

Jiménez, D., Herrera-Mora, J., Rampp, M., Laure, E., & Meneses, E. (2022). Implementing a GPU-portable field line tracing application with OpenMP Offload. In P. Navaux, C. J. Barrios H., C. Osthoff, & G. Guerrero (Eds.), High Performance Computing: 9th Latin American Conference, CARLA 2022, Porto Alegre, Brazil, September 26–30, 2022, Revised Selected Papers (pp. 31–46). Cham: Springer. doi:10.1007/978-3-031-23821-5_3.

Cite as: https://hdl.handle.net/21.11116/0000-000C-1E1D-4

Abstract

Accelerated computing is becoming more diverse as new vendors and architectures come into play. Although platform-specific programming models promise ease of development and better control over performance, they still restrict the portability of scientific applications. As the OpenMP offloading specification is adopted by more compilers, this programming model stands out as a vendor-neutral, portable approach to heterogeneous programming. In this study, we port a plasma-physics-oriented field line tracing code from a CPU-based MPI+OpenMP approach to a GPU-accelerated version using OpenMP’s offloading capabilities. We analyze GPU performance across different vendors with respect to the original CPU version and test both prescriptive and descriptive approaches to accelerator programming. A maximum speedup of 6× over the CPU implementation was achieved using OpenMP’s high-level offloading directives. In addition, we demonstrate portability across three different vendor GPUs with no code modifications.