Using machine learning techniques to evaluate multicore soft error reliability
journal contribution
posted on 2019-06-06, 10:38 authored by Felipe Rocha da Rosa, Rafael Garibotti, Luciano Ost, Ricardo Reis
Virtual platform frameworks have been extended
to allow earlier soft error analysis of more realistic multicore
systems (i.e., real software stacks, state-of-the-art ISAs). The
high observability and simulation performance of underlying
frameworks make it possible to generate and collect more error/failure-related
data, considering complex software stack configurations,
in a reasonable time. When dealing with sizeable failure-related
data sets obtained from multiple fault campaigns, it is essential to
filter out parameters (i.e., features) without a direct relationship
with the system soft error analysis. In this regard, this paper proposes the use of supervised and unsupervised machine learning
techniques, aiming to eliminate non-relevant information as well
as identify the correlation between fault injection results and
application and platform characteristics. This novel approach
provides engineers with appropriate means to
investigate new and more efficient fault mitigation techniques.
The underlying approach is validated with an extensive data set
gathered from more than 1.2 million fault injections, comprising
several benchmarks, a Linux OS and parallelization libraries
(e.g., MPI, OpenMP), as well as through a realistic automotive
case study.
History
School
- Mechanical, Electrical and Manufacturing Engineering
Published in
IEEE Transactions on Circuits and Systems I: Regular Papers
Volume
66
Issue
6
Pages
2151 - 2164
Citation
DA ROSA, F.R. .... et al., 2019. Using machine learning techniques to evaluate multicore soft error reliability. IEEE Transactions on Circuits and Systems I: Regular Papers, 66(6), pp. 2151 - 2164.
Publisher
© Institute of Electrical and Electronics Engineers (IEEE)
Version
- AM (Accepted Manuscript)
Acceptance date
2019-03-11
Publication date
2019-04-17
Notes
Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN
1549-8328
eISSN
1558-0806
Publisher version
Language
- en