
    ISO 26262 Hardware Metrics Demystified: A Practical Guide for Design Engineers

    Thomas Aubert · January 20, 2026 · 9 min

    ISO 26262, the functional safety standard for road vehicles, introduces hardware safety metrics that strike fear into the hearts of design engineers. Single Point Fault Metric (SPFM), Latent Fault Metric (LFM), and Probabilistic Metric for random Hardware Failures (PMHF) — these acronyms represent quantitative requirements that your hardware design must meet, and the path to compliance is rarely straightforward.

    This guide explains what each metric actually measures, why it matters, and how to design your hardware to meet the targets.

    The Big Picture: Why Hardware Metrics Exist

    ISO 26262 is built on the principle that safety-critical electronic systems must be designed to either prevent dangerous failures or detect and mitigate them. The hardware metrics quantify how well your design achieves these goals.

    The fundamental concept is the "safety mechanism" — any design feature that detects, prevents, or mitigates a fault. A watchdog timer is a safety mechanism. A redundant sensor is a safety mechanism. A plausibility check in firmware is a safety mechanism. The hardware metrics measure the coverage of your safety mechanisms against different categories of faults.

    Single Point Fault Metric (SPFM)

    The SPFM measures the coverage of your design against "single point faults" — faults in a single hardware element that can directly cause a safety-relevant failure without being detected by any safety mechanism.

    The formula: SPFM = 1 - (Σ λ_SPF + Σ λ_RF) / (Σ λ_safety-related)

    Where λ_SPF is the failure rate of each single point fault, λ_RF is the residual failure rate (the dangerous fraction of a fault that its safety mechanism covers only partially), and λ_safety-related is the total failure rate of all safety-related hardware elements.

    In plain language: What fraction of your safety-related failure modes are either not dangerous (safe faults) or are covered by a safety mechanism?

    ASIL targets:
    - ASIL B: ≥ 90%
    - ASIL C: ≥ 97%
    - ASIL D: ≥ 99%
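
    To make the arithmetic concrete, here is a minimal sketch of an SPFM calculation in Python. The elements, failure rates, safe fractions, and diagnostic coverage figures are invented for illustration; in a real analysis they come from the FMEDA, at failure mode level.

```python
# Minimal SPFM sketch. Failure rates are in FIT (failures per 1e9 hours);
# all numbers below are invented for illustration.
elements = [
    # (name,               lambda_FIT, safe_fraction, diagnostic_coverage)
    ("main MCU",                150.0,  0.50,          0.99),
    ("sensor interface",         40.0,  0.30,          0.90),
    ("power supervisor",         25.0,  0.60,          0.00),  # no mechanism
]

total = sum(lam for _, lam, _, _ in elements)

# Single point / residual faults: the dangerous fraction of each element's
# failure rate that no safety mechanism (fully) covers.
uncovered = sum(lam * (1 - safe) * (1 - dc) for _, lam, safe, dc in elements)

spfm = 1 - uncovered / total
print(f"SPFM = {spfm:.1%}")                      # ~93.7% with these numbers
print("meets ASIL B (>= 90%):", spfm >= 0.90)    # True
print("meets ASIL D (>= 99%):", spfm >= 0.99)    # False: needs more coverage
```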

    Practical Design Strategies for SPFM

    The most direct way to improve SPFM is to add safety mechanisms that detect single point faults. Common approaches include:

    Redundancy: Duplicate critical sensors and compare their outputs. If they disagree, a fault is detected. This is expensive in terms of BOM cost and PCB area, but provides high diagnostic coverage (typically 90-99%).

    Built-in self-test (BIST): Design the hardware to test itself. ADC self-test modes, memory BIST, and logic BIST can detect faults without redundant hardware. Coverage varies widely (60-99%) depending on the test's thoroughness.

    Monitoring circuits: Add dedicated monitoring hardware that checks the output of safety-critical circuits. Voltage monitors, current monitors, and window comparators can detect out-of-range conditions with high reliability.

    Firmware plausibility checks: Use firmware to check whether hardware outputs are physically plausible. A temperature sensor reading -50°C in a car engine compartment is clearly faulty. These checks are inexpensive to implement but their coverage credit under ISO 26262 requires careful justification.
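
    Two of these strategies are easy to show in code. The sketch below, written in Python for readability (production firmware would typically be C against the real drivers), shows a redundant-pair comparison and a temperature plausibility check; the tolerances and limits are invented example values, not figures from any standard or datasheet.

```python
# Illustrative safety checks; all limits and tolerances are invented examples.
TEMP_MIN_C = -40.0          # below any plausible engine-compartment reading
TEMP_MAX_C = 150.0          # above any credible engine-compartment reading
MAX_SLEW_C_PER_S = 20.0     # faster changes than this are physically implausible

def redundant_pair_fault(primary: float, secondary: float,
                         tolerance: float = 0.05) -> bool:
    """Redundancy check: True if the two readings disagree beyond a tolerance."""
    reference = max(abs(primary), abs(secondary), 1e-9)
    return abs(primary - secondary) / reference > tolerance

def temperature_plausible(value_c: float, previous_c: float, dt_s: float) -> bool:
    """Plausibility check: the reading must lie in a physically possible range
    and must not change faster than the thermal mass of the system allows."""
    in_range = TEMP_MIN_C <= value_c <= TEMP_MAX_C
    slew_ok = dt_s <= 0 or abs(value_c - previous_c) / dt_s <= MAX_SLEW_C_PER_S
    return in_range and slew_ok
```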

    Latent Fault Metric (LFM)

    The LFM addresses a subtler problem: faults that don't cause an immediate failure but disable a safety mechanism. If a redundant sensor fails silently, the system continues to operate normally — but if the primary sensor also fails, there is now no backup. The redundant sensor's failure is "latent" because it lurks undetected until a second fault makes it dangerous.

    The formula: LFM = 1 - (Σ λ_MPF,latent) / (Σ λ_safety-related - Σ λ_SPF - Σ λ_RF)

    Where λ_MPF,latent is the failure rate of each multiple point fault that stays latent, and the denominator is the safety-related failure rate left after single point and residual faults have been removed.

    In plain language: Of the faults that are not already counted as single point or residual faults, what fraction is either safe or revealed (by a safety mechanism, a periodic test, or the driver) before a second fault can combine with it?

    ASIL targets:
    - ASIL B: ≥ 60%
    - ASIL C: ≥ 80%
    - ASIL D: ≥ 90%
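
    Continuing with the same kind of invented numbers as in the SPFM sketch above, the short example below shows how the LFM bookkeeping differs: single point and residual faults drop out of the denominator, and only the remaining faults that nothing would ever reveal count against you.

```python
# Minimal LFM sketch; failure rates in FIT, all numbers invented.
lambda_total = 215.0     # total safety-related failure rate
lambda_spf_rf = 13.55    # single point + residual faults (from the SPFM step)
lambda_latent = 18.0     # remaining faults no test, mechanism, or driver reveals

# Safe faults stay in the denominator; only single point and residual faults
# are removed, because they were already counted against the SPFM.
lfm = 1 - lambda_latent / (lambda_total - lambda_spf_rf)
print(f"LFM = {lfm:.1%}")                        # ~91.1% with these numbers
print("meets ASIL D (>= 90%):", lfm >= 0.90)     # True
```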

    Practical Design Strategies for LFM

    Latent faults are harder to address than single-point faults because they require you to test the safety mechanisms themselves.

    Periodic online tests: Run diagnostic tests during normal operation that verify safety mechanism functionality. For example, periodically inject a known signal into a redundant sensor path to verify that the comparison logic is working.

    Power-on self-test (POST): Test safety mechanisms at every system startup. This provides a guaranteed test interval equal to the time between power cycles.

    Cross-monitoring: Design safety mechanisms to monitor each other. If safety mechanism A monitors component X, add a check that periodically verifies safety mechanism A itself is operational.
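
    As a sketch of what a periodic online test can look like, the Python snippet below checks the safety mechanism itself: it feeds the redundant-sensor comparator an obvious disagreement and an obvious agreement and verifies that it classifies both correctly. Names, values, and the simulated comparator are invented for illustration.

```python
# Latent-fault test for the comparison mechanism itself; all values invented.

def comparator(primary: float, secondary: float, tolerance: float = 0.05) -> bool:
    """The safety mechanism under test: True means 'fault detected'."""
    reference = max(abs(primary), abs(secondary), 1e-9)
    return abs(primary - secondary) / reference > tolerance

def latent_fault_test() -> bool:
    """Inject known stimuli and confirm the mechanism reacts correctly."""
    detects_disagreement = comparator(100.0, 50.0)       # must report a fault
    accepts_agreement = not comparator(100.0, 100.1)     # must not report one
    return detects_disagreement and accepts_agreement

# Called periodically from the main loop (or at power-on self-test), so a
# broken comparator cannot stay hidden for the life of the vehicle.
if not latent_fault_test():
    print("Latent fault: comparison mechanism is not responding correctly")
```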

    Probabilistic Metric for Random Hardware Failures (PMHF)

    The PMHF takes a different approach from SPFM and LFM. Instead of measuring coverage percentages, it calculates the actual probability of a dangerous failure occurring over the vehicle's lifetime.

    In plain language: How many times per billion hours of operation (the unit known as FIT) will your system experience a dangerous failure that is not detected by any safety mechanism?

    ASIL targets:
    - ASIL B: < 10^-7 / hour
    - ASIL C: < 10^-7 / hour
    - ASIL D: < 10^-8 / hour

    The PMHF calculation requires detailed failure rate data for every component in the safety-relevant hardware. This data comes from reliability databases (SN 29500, IEC TR 62380, MIL-HDBK-217) and must be applied at the failure mode level — not just the component level.
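
    At its simplest, a PMHF estimate sums the dangerous, undetected contribution of each failure mode. The sketch below uses invented FIT values and coverage figures and only counts single point and residual contributions; a real calculation also has to handle dual-point faults, exposure durations, and the vehicle lifetime, which this simplification ignores.

```python
# Simplified PMHF sketch (single point / residual contributions only).
# Failure rates in FIT (failures per 1e9 hours); every number is invented.
failure_modes = [
    # (failure mode,                    lambda_FIT, dangerous_fraction, coverage)
    ("MCU core lockup",                      20.0,  1.00,               0.99),
    ("ADC stuck at mid-scale",                8.0,  0.80,               0.90),
    ("Supply over-voltage undetected",        5.0,  0.60,               0.00),
]

pmhf_fit = sum(lam * dangerous * (1 - cov)
               for _, lam, dangerous, cov in failure_modes)
pmhf_per_hour = pmhf_fit * 1e-9

print(f"PMHF ~ {pmhf_fit:.2f} FIT ({pmhf_per_hour:.2e} per hour)")
print("meets ASIL D (< 1e-8 per hour):", pmhf_per_hour < 1e-8)   # True here
```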

    The Traceability Imperative

    Achieving these metrics is an engineering challenge. Proving that you've achieved them is a documentation challenge. The ISO 26262 assessor will want to see:

    - A complete list of all safety-related hardware elements
    - Failure modes, effects, and diagnostic analysis (FMEDA) for each element, down to individual failure modes
    - Identification of every safety mechanism and its diagnostic coverage
    - Quantitative calculations for SPFM, LFM, and PMHF
    - Traceability from safety requirements to safety mechanisms to hardware design features

    This traceability chain must be maintained throughout the development process and updated every time the hardware design changes. In a design with hundreds of components and dozens of safety mechanisms, manual traceability is impractical. Graph-based traceability platforms that automatically maintain the relationships between safety requirements, safety mechanisms, and hardware elements are becoming essential tools for ISO 26262 compliance.
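
    The underlying data model does not need to be exotic: it is a graph whose nodes are safety requirements, safety mechanisms, and hardware elements, with edges recording which mechanism satisfies which requirement and which hardware implements which mechanism. The sketch below uses a hypothetical structure and identifiers (not any particular tool's schema) to show the kind of gap check such a graph makes trivial to automate.

```python
# Minimal traceability graph sketch; identifiers and structure are hypothetical.
requirement_to_mechanisms = {
    "SR-001 detect sensor failure": ["SM-01 dual-sensor comparison"],
    "SR-002 detect MCU lockup":     ["SM-02 external watchdog"],
    "SR-003 detect ADC drift":      [],   # gap: no mechanism assigned yet
}
mechanism_to_hardware = {
    "SM-01 dual-sensor comparison": ["U12 secondary sensor", "U3 MCU"],
    "SM-02 external watchdog":      ["U7 watchdog IC"],
}

# A check worth running automatically on every design change:
for requirement, mechanisms in requirement_to_mechanisms.items():
    if not mechanisms:
        print(f"Uncovered requirement: {requirement}")
    for mechanism in mechanisms:
        if not mechanism_to_hardware.get(mechanism):
            print(f"Mechanism with no hardware allocation: {mechanism}")
```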

    Building Safety In, Not Bolting It On

    The most important lesson of ISO 26262 hardware metrics is that safety must be designed in from the beginning, not analyzed after the fact. Teams that design their hardware first and then calculate the metrics invariably discover gaps that require expensive redesigns.

    Instead, use the metric targets as design constraints from day one. When selecting a microcontroller, evaluate its self-test capabilities. When designing a sensor interface, plan the diagnostic coverage from the schematic stage. When laying out the PCB, consider the testability of safety mechanisms.

    Safety is not an obstacle to innovation. It is a design discipline that, when practiced well, produces better products.
