The effects of cache coherence on the performance of parallel PDE algorithms in multiprocessor systems

Date
1988
Journal Title
Journal ISSN
Volume Title
Publisher
Description
Abstract

The advent of parallel processing systems has resulted in the potential for increased performance over traditional uniprocessor systems. However, while there has been significant advances in developing these systems, designing parallel algorithms to run on them has not kept up with the pace. Although parallel algorithms have been studied in the literature, very little has been done in studying how various architectural features effect the performance of these algorithms. This thesis presents the results of a study conducted to determine how one particular design feature of a parallel processing architecture, cache coherence maintenance, affects the performance of parallel partial differential equations' (PDE) algorithms. A high performance shared-memory multiprocessor architecture with private caches and a single bus or full crossbar interconnection network is assumed. The performance degradation as a result of using a directory based cache coherence protocol is evaluated on specific implementations of three synchronous parallel PDE algorithms (Jacobi's algorithm, red-black successive over-relaxation or SOR and the preconditioned conjugate gradient algorithm or PCG). A trace driven cache simulator determines this degradation. The trace is obtained by symbolically executing the algorithm on the multiprocessor system. Parameters derived to evaluate the performance degradation are used as input into an execution time model used to calculate the time needed to execute one iteration of each algorithm. This facilitates parallel algorithm speedup calculations over the sequential algorithm as well as over the parallel algorithm without cache coherence. The results show that implementing cache coherence degrades the overall performance of the parallel PDE algorithms considered by 10 to 30 percent. Various cache design features such as the cache blocksize, the mapping function and the cache size and the algorithm design feature considered, the PDE grid decomposition strategy, have no appreciable effects on the algorithm performance degradation. In fact, the major factors affecting this degradation are the cache miss ratio, the size of the PDE grid relative to the cache size and the write probability of the parallel algorithm. Finally, for the coherence protocol used in this study, SOR has the best speedup performance, followed by Jacobi's algorithm and PCG, respectively.

Description
Degree
Doctor of Philosophy
Type
Thesis
Keywords
Computer science, Electronics, Electrical engineering, Mathematics
Citation

Johnson, Sandra Kay. "The effects of cache coherence on the performance of parallel PDE algorithms in multiprocessor systems." (1988) Diss., Rice University. https://hdl.handle.net/1911/16158.

Has part(s)
Forms part of
Published Version
Rights
Copyright is held by the author, unless otherwise indicated. Permission to reuse, publish, or reproduce the work beyond the bounds of fair use or other exemptions to copyright law must be obtained from the copyright holder.
Link to license
Citable link to this page