High level modeling and mitigation of transient errors in nano-scale systems

Title:
High level modeling and mitigation of transient errors in nano-scale systems
Creator:
Shazli, Syed Zafar (Author)
Contributor:
Tahoori, Mehdi B. (Advisor)
Kaeli, David R. (Committee member)
Mi, Ningfang (Committee member)
Publisher:
Boston, Massachusetts : Northeastern University, 2011
Copyright date:
2011
Date Accepted:
January 2011
Date Awarded:
May 2011
Type of resource:
Text
Genre:
Dissertations
Format:
electronic
Digital origin:
born digital
Abstract/Description:
Soft errors, caused by cosmic radiation, are a major reliability barrier for VLSI designs. The vulnerability of such systems to soft errors grows exponentially with technology scaling. To meet reliability constraints in a cost-effective way, it is critical to assess soft error reliability parameters in early design stages so that reliability can be optimized across the entire design cycle. Unlike soft error modeling for gate-level netlists, soft error propagation models for high-level behavioral designs are not straightforward. We divide the work into three parts. First, the Soft Error Rate (SER) computation problem is modeled as a Boolean Satisfiability (SAT) problem, and SAT solvers are used to compute SER for combinational and sequential circuits. SAT is also used to compute a metric called the Hardware Vulnerability Factor (HVF): the probability that an error in any bit of the internal processor structure will result in an error in a program-visible state. The HVF computation problem is transformed into an equivalent Boolean satisfiability problem, and state-of-the-art SAT solvers are used to obtain the HVF for a 5-stage MIPS pipeline. Next, several schemes are proposed for detecting, correcting, and recovering from soft errors in processor pipelines. Two types of pipelines are considered: a simple 5-stage MIPS pipeline and a superscalar pipeline similar to the ALPHA processor. Lastly, a case study involving thousands of high-availability systems is presented. The study considers soft errors occurring in the processors used in these systems.
Subjects and keywords:
Error Recovery
Field Analysis
Online Error Detection
SER Estimation
Soft Errors
Transient Errors
Electrical and Computer Engineering
Engineering
DOI:
https://doi.org/10.17760/d20002793
Permanent Link:
http://hdl.handle.net/2047/d20002793
Use and reproduction:
In Copyright: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the right-holder(s). (http://rightsstatements.org/vocab/InC/1.0/)
Copyright restrictions may apply.
