An Integrated Compile-Time/Run-Time Software Distributed Shared Memory System

Date
1997-11-17
Journal Title
Journal ISSN
Volume Title
Publisher
Description
Abstract

High Performance Fortran (HPF), as well as its predecessor FortranD,has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and use data layout directives to direct the partitioning of the data and computation among the processors of a parallel machine. For HPF to gain acceptance as a vehicle for parallel scientific programming, it must achieve high performance on problems for which it is well suited. To achieve high performance with an HPF program on a distributed-memory parallel machine, an HPF compiler must do a superb job of translating Fortran90 data-parallel array constructs into an efficient sequence of operations that minimize the overhead associated with data movement and also maximize data locality. This dissertation presents and analyzes a set of advanced optimizations designed to improve the execution performance of HPF programs on distributed-memory architectures. Presented is a methodology for performing deep analysis ofFortran90 programs, eliminating the reliance upon pattern matching to drive the optimizations as is done in many Fortran90 compilers. The optimizations address the overhead of data movement, both interprocessor and intraprocessor movement, that results from the translation of Fortran90 array constructs. Additional optimizations address the issues of scalarizing array assignment statements, loop fusion, and data locality. The combination of these optimizations results in a compiler that is capable of optimizing dense matrix stencil computations more completely than all previous efforts in this area. This work is distinguished by advanced compile-time analysis and optimizations performed at the whole-array level as opposed to analysis and optimization performed at the loop or array-element levels.

Description
Advisor
Degree
Type
Technical report
Keywords
Citation

Cox, Alan, Dwarkadas, Sandhya and Zwaenepoel, Willy. "An Integrated Compile-Time/Run-Time Software Distributed Shared Memory System." (1997) https://hdl.handle.net/1911/96479.

Has part(s)
Forms part of
Published Version
Rights
You are granted permission for the noncommercial reproduction, distribution, display, and performance of this technical report in any format, but this permission is only for a period of forty-five (45) days from the most recent time that you verified that this technical report is still available from the Computer Science Department of Rice University under terms that include this permission. All other rights are reserved by the author(s).
Link to license
Citable link to this page