EHWPOISON
Linux / POSIX | ERROR | Critical | Device | HIGH confidence

Memory Page Has Hardware Error

Production Risk

CRITICAL — indicates defective RAM. Replace the faulty DIMM immediately.

What this means

EHWPOISON (errno 133 on most Linux architectures) is returned when the kernel has detected an uncorrectable hardware error in a memory page, typically an uncorrectable ECC error, and refuses to access the page to prevent data corruption. It is a Linux-specific error code and is not defined by POSIX.

Why it happens
  1. RAM with an uncorrectable ECC memory error
  2. A hardware memory error reported as a Machine Check Exception (MCE) on the page
  3. The page has been "poisoned" by the kernel to prevent use of bad RAM
How to reproduce

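Call read() on, or touch an mmap'd mapping of, a page that has a hardware memory error. Access through a mapping is usually reported to the process as SIGBUS rather than as an errno value.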

trigger — this will error
// Accessing a page with uncorrectable ECC error
ssize_t n = read(fd, buf, sizeof(buf));
// Returns -1, errno = EHWPOISON if page is hardware-poisoned

expected output

read: Memory page has hardware error (EHWPOISON)
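
The check described above could be wrapped in a small helper so that a poisoned page is reported differently from ordinary I/O errors. This is only a sketch: `read_status`, `classify_read_errno`, and `read_checked` are hypothetical names, and the fallback value 133 assumes the generic Linux errno numbering.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: fall back to the generic Linux value if the libc headers
 * do not define EHWPOISON (the number differs on a few architectures). */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Hypothetical result codes for this sketch. */
enum read_status { READ_OK, READ_HWPOISON, READ_OTHER_ERROR };

/* Map an errno value from a failed read() to a coarse status. */
enum read_status classify_read_errno(int err)
{
    return err == EHWPOISON ? READ_HWPOISON : READ_OTHER_ERROR;
}

/* Read once from fd and report a hardware-poisoned page distinctly.
 * On success *out holds the byte count; on failure it is set to -1. */
enum read_status read_checked(int fd, void *buf, size_t len, ssize_t *out)
{
    ssize_t n = read(fd, buf, len);
    *out = n;
    if (n >= 0)
        return READ_OK;

    int err = errno; /* save before any call that might overwrite it */
    enum read_status st = classify_read_errno(err);
    if (st == READ_HWPOISON)
        fprintf(stderr, "read: hardware-poisoned page, check the MCE log\n");
    else
        fprintf(stderr, "read: %s\n", strerror(err));
    return st;
}
```

A caller would typically treat `READ_HWPOISON` as a signal to stop and alert an operator, since the cause is in hardware rather than in the request.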

Fix

Replace the faulty RAM

When EHWPOISON is returned

# Check memory error log
mcelog
# Or check dmesg for MCE events
dmesg | grep -i "mce\|hardware error\|poison"
# Run memtest to identify faulty DIMM
# Boot memtest86+ from GRUB or USB
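
As a complement to the log checks above, a program can read the `HardwareCorrupted` field of `/proc/meminfo`, which reports how much memory the kernel has taken out of use because of hardware errors. The sketch below assumes the field is present (it depends on the kernel's memory-failure support being enabled); the two function names are hypothetical.

```c
#include <stdio.h>

/* Parse one /proc/meminfo line; return the HardwareCorrupted value in kB,
 * or -1 if the line is a different field. */
long parse_hardware_corrupted_kb(const char *line)
{
    long kb;
    if (sscanf(line, "HardwareCorrupted: %ld kB", &kb) == 1)
        return kb;
    return -1;
}

/* Scan /proc/meminfo; return -1 if the file or the field is unavailable. */
long read_hardware_corrupted_kb(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;

    char line[256];
    long kb = -1;
    /* Stop at the first line that matches the field. */
    while (kb < 0 && fgets(line, sizeof line, f))
        kb = parse_hardware_corrupted_kb(line);

    fclose(f);
    return kb;
}
```

A nonzero result suggests that at least one page has been poisoned since boot, which is a reason to inspect the MCE log and schedule a memory test.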

Why this works

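EHWPOISON indicates a physical RAM failure, so no software change can repair it. The kernel keeps the poisoned page out of use, and other pages on the same module may fail as well until the faulty DIMM is replaced.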

What not to do

Retry the operation on a poisoned page

The hardware error is permanent; repeated access will continue to fail until the faulty DIMM is replaced.
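
One way to encode this rule is a retry helper that only repeats the call for transient conditions and gives up at once on EHWPOISON. This is a minimal sketch: `is_transient_errno` and `read_retry_transient` are hypothetical names, the choice of EINTR and EAGAIN as the transient set is an assumption, and the fallback value 133 assumes the generic Linux errno numbering.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Assumption: generic Linux value, used only if the headers lack it. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Only transient conditions are worth retrying; EHWPOISON is not one. */
bool is_transient_errno(int err)
{
    if (err == EHWPOISON)
        return false; /* hardware fault: retrying cannot succeed */
    return err == EINTR || err == EAGAIN;
}

/* Retry read() at most max_attempts times, and only on transient errors.
 * Returns the byte count, or -1 with errno set by the last attempt.
 * Note: a real caller would wait (e.g. with poll) before retrying EAGAIN. */
ssize_t read_retry_transient(int fd, void *buf, size_t len, int max_attempts)
{
    ssize_t n = -1;
    for (int i = 0; i < max_attempts; i++) {
        n = read(fd, buf, len);
        if (n >= 0 || !is_transient_errno(errno))
            break;
    }
    return n;
}
```

With this shape, a poisoned page ends the loop on the first failure and the error reaches the caller unchanged.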
