Scheduling for Locality in Shared-Memory Multiprocessors

Markatos, Evangelos

UR Research > Computer Science Department > CS Systems Technical Reports >

Scheduling for Locality in Shared-Memory Multiprocessors

URL to cite or link to: http://hdl.handle.net/1802/824

93.tr457.scheduling_for_locality_in_shared_memory_multiproces.ps 1.09 MB (No. of downloads : 125)

Description

Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 1993. Simultaneously published in the Technical Report series.

Abstract

The last decade has produced enormous improvements in processor speed without a corresponding improvement in bus or interconnection network speeds. As a result, the relative costs of communication and computation in shared-memory multiprocessors have changed dramatically. An important consequence of this trend is that many parallel applications, whose performance depends on a delicate balance between the cost of communication and computation, do not execute efficiently on today's shared-memory multiprocessors. In this dissertation we quantify the effect of this trend in multiprocessor architecture on parallel program performance, explain the implications of this trend on popular parallel programming models, and propose system software to efficiently map parallel programs and programming models to modern shared-memory multiprocessors. Our experiments with application programs on bus-based, cache-coherent machines like the Sequent Symmetry, and large-scale distributed-memory machines like the BBN Butterfly, confirm that applications scale much better on previous-generation machines than on current machines due to the rising cost of communication. Our experiments also suggest that shared-memory programming models, which can be implemented efficiently on the machines of yesterday, do not readily port to state-of-the-art machines. As a solution, we propose new decomposition and scheduling algorithms that significantly reduce communication overhead. Our scheduling algorithms, which apply equally well to run-time libraries and parallelizing compilers, attempt to co-locate processes and data, assigning processes to processors based on the location of the date they will access. Our experiments over a wide variety of shared-memory multiprocessors demonstrate that the performance benefits of these scheduling-for-locality algorithms are significant, improving performance by up to 60\% for some applications on modern machines. We conclude that communication overhead need not dominate performance on present or future multiprocessors, given an appropriate programming model, multiprogramming scheduling policy, and user-level decomposition and scheduling algorithms.

Contributor(s):

Evangelos P. Markatos - Author

Thomas J. LeBlanc - Thesis Advisor

Primary Item Type:

Technical Report

Secondary Item Type(s):

Thesis

Series/Report Number:

UR CSD / TR457

Language:

English

Subject Keywords:

shared-memory multiprocessors;multiprogramming;architecture trends;loop scheduling;lightweight thread scheduling

First presented to the public:

5/1993

Original Publication Date:

5/1993

Previously Published By:

University of Rochester. Computer Science Department.

Citation:

License Grantor / Date Granted:

Suzanne S. Bell / 2009-08-01 13:03:13.872 ( View License )

Date Deposited

2009-08-01 13:03:13.868

Date Last Updated

2012-09-26 16:35:14.586719

Submitter:

Suzanne S. Bell

All Versions

Thumbnail	Name	Version	Created Date
	Scheduling for Locality in Shared-Memory Multiprocessors	1	2009-08-01 13:03:13.868

Reason for withdraw :*
Display metadata:
Withdraw all versions: