NTRS - NASA Technical Reports Server

Dynamic remapping decisions in multi-phase parallel computations

The effectiveness of any given mapping of workload to processors in a parallel system depends on the stochastic behavior of the workload. Program behavior is often characterized by a sequence of phases, with phase changes occurring unpredictably. During a phase, the behavior is fairly stable, but it may become quite different during the next phase. Thus a workload assignment generated for one phase may hinder performance during the next phase. We consider the problem of deciding whether to remap a parallel computation in the face of uncertainty in remapping's utility. Fundamentally, it is necessary to balance the expected remapping performance gain against the delay cost of remapping. This paper treats this problem formally by constructing a probabilistic model of a computation with at most two phases. We use stochastic dynamic programming to show that the remapping decision policy which minimizes the expected running time of the computation has an extremely simple structure: the optimal decision at any step is found by comparing the probability of remapping gain against a threshold. This theoretical result stresses the importance of detecting a phase change and assessing the possibility of gain from remapping. We also empirically study the sensitivity of optimal performance to an imprecise decision threshold. Under a wide range of model parameter values, we find nearly optimal performance if remapping is chosen simply when the gain probability is high. These results strongly suggest that, except in extreme cases, the remapping decision problem is essentially that of dynamically determining whether gain can be achieved by remapping after a phase change; precise quantification of the decision model parameters is not necessary.
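The threshold structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model: the abstract gives no formulas, so the Bayesian update, the per-step phase-change probability `hazard`, the observation likelihoods, and every function name below are hypothetical and chosen only for illustration. The sketch also assumes, as a simplification, that the probability of remapping gain equals the posterior probability that the phase change has already happened.

```python
def update_gain_probability(prior, hazard, lik_changed, lik_unchanged):
    """One Bayes step for the belief that the phase has changed.

    Assumption (not from the paper): the phase changes with probability
    `hazard` at each step, and the current observation has likelihood
    `lik_changed` under the new phase and `lik_unchanged` under the old one.
    """
    # Belief before seeing the observation: either the change had already
    # happened, or it happened during this step.
    predicted = prior + (1.0 - prior) * hazard
    numerator = predicted * lik_changed
    denominator = numerator + (1.0 - predicted) * lik_unchanged
    if denominator == 0.0:
        # The observation is impossible under both phases; keep the prediction.
        return predicted
    return numerator / denominator


def should_remap(gain_probability, threshold):
    """Threshold rule: remap once the gain probability reaches the threshold."""
    return gain_probability >= threshold


def first_remap_step(observation_likelihoods, hazard, threshold, prior=0.0):
    """Return the first step index at which the rule says to remap, or None."""
    belief = prior
    for step, (lik_changed, lik_unchanged) in enumerate(observation_likelihoods):
        belief = update_gain_probability(belief, hazard, lik_changed, lik_unchanged)
        if should_remap(belief, threshold):
            return step
    return None


if __name__ == "__main__":
    # Hypothetical likelihood pairs: the first observation is uninformative,
    # the next two favor the hypothesis that the phase has changed.
    observations = [(0.5, 0.5), (0.9, 0.1), (0.9, 0.1)]
    print(first_remap_step(observations, hazard=0.1, threshold=0.8))
```

Raising `threshold` delays remapping until the evidence for a phase change is stronger. The abstract's sensitivity finding suggests that the exact threshold value matters little across a wide range of model parameters, so a rule of the form "remap when the gain probability is high" is a reasonable starting point.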
Document ID
19870001286
Acquisition Source
Legacy CDMS
Document Type
Preprint (Draft being sent to journal)
Authors
Nicol, D. M.
(NASA Langley Research Center Hampton, VA, United States)
Reynolds, P. F., Jr.
(NASA Langley Research Center Hampton, VA, United States)
Date Acquired
September 5, 2013
Publication Date
September 1, 1986
Subject Category
Computer Programming And Software
Report/Patent Number
NAS 1.26:178174
ICASE-86-58
NASA-CR-178174
Accession Number
87N10719
Funding Number(s)
CONTRACT_GRANT: NAS1-18107
PROJECT: RTOP 505-90-21-01
CONTRACT_GRANT: NAS1-17070
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.