I'm using MemTest86 V9.4 Pro Build: 1000 (64-bit) on a Lenovo nx360 server, booted via EFI/GRUB.
The server has 16 DIMMs, two of which are flagged by the BIOS as bad (stuck bit). I do have reason to believe that two of the DIMMs really are bad, although I am not certain that the BIOS identified the correct ones.
Running MemTest86 (3 passes) results in PASSED for all 16 DIMMs. How is that possible? This is reproducible.
Where do I go from here? Can I trust MemTest86?
The system event log on the server reports:
91 | 05/16/2024 03:11:33 | Memory Device 11 (Memory - DIMM 11): Assertion: Memory Scrub Failed (stuck bit) | 16 (address). IPMB device LUN 0. Channel 0. | 1 |
92 | 05/16/2024 03:11:35 | Memory Device 12 (Memory - DIMM 12): Assertion: Memory Scrub Failed (stuck bit) | 16 (address). IPMB device LUN 0. Channel 0. | 1 |
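When the event log is longer than these two entries, the flagged slots can be pulled out mechanically. The following is a minimal sketch that assumes the log is available as pipe-delimited text lines in the format quoted above; the function name `flagged_dimms` is made up for illustration and is not part of any vendor tool.

```python
import re

# The two event-log lines quoted above, copied verbatim.
SEL_LINES = [
    "91 | 05/16/2024 03:11:33 | Memory Device 11 (Memory - DIMM 11): "
    "Assertion: Memory Scrub Failed (stuck bit) | 16 (address). "
    "IPMB device LUN 0. Channel 0. | 1 |",
    "92 | 05/16/2024 03:11:35 | Memory Device 12 (Memory - DIMM 12): "
    "Assertion: Memory Scrub Failed (stuck bit) | 16 (address). "
    "IPMB device LUN 0. Channel 0. | 1 |",
]


def flagged_dimms(lines):
    """Return the sorted DIMM numbers named in 'Memory Scrub Failed' events.

    Hypothetical helper: it only understands the line format shown above.
    """
    dimms = set()
    for line in lines:
        if "Memory Scrub Failed" not in line:
            continue  # ignore unrelated events
        match = re.search(r"DIMM (\d+)", line)
        if match:
            dimms.add(int(match.group(1)))
    return sorted(dimms)


print(flagged_dimms(SEL_LINES))
```

For the two lines above this prints `[11, 12]`, which are the slot numbers to compare against the SPD entries in the MemTest86 report.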
MemTest86 reports (truncated for brevity):
Note that the reporting for SPD #12 looks a bit funky; the product ID is truncated.
SPD #11 | 32GB DDR4 ECC PC4-17000 | Samsung / M386A4G40DM0-CPB / 31E792CA / Channel: 1 Slot: 1 | 15-15-15-36 / 2134 MHz / 1.2V |
SPD #12 | 32GB DDR4 ECC PC4-17000 | Samsung / Ml / 31E788CA / Channel: 0 Slot: 0 | 15-15-15-36 / 2134 MHz / 1.2V |
Result summary
Test Start Time | 2024-05-16 03:38:24 |
Elapsed Time | 91:05:44 |
Memory Range Tested | 0x0 - 7080000000 (460800MB) |
CPU Selection Mode | Parallel (All CPUs) |
CPU Temperature Min/Max/Ave | 38C/66C/50C |
RAM Temperature Min/Max/Ave | -/-/- |
ECC Polling | Enabled |
# Tests Passed | 42/42 (100%) |
Test | # Tests Passed | Errors |
Test 0 [Address test, walking ones, 1 CPU] | 3/3 (100%) | 0 |
Test 1 [Address test, own address, 1 CPU] | 3/3 (100%) | 0 |
Test 2 [Address test, own address] | 3/3 (100%) | 0 |
Test 3 [Moving inversions, ones & zeroes] | 3/3 (100%) | 0 |
Test 4 [Moving inversions, 8-bit pattern] | 3/3 (100%) | 0 |
Test 5 [Moving inversions, random pattern] | 3/3 (100%) | 0 |
Test 6 [Block move, 64-byte blocks] | 3/3 (100%) | 0 |
Test 7 [Moving inversions, 32-bit pattern] | 3/3 (100%) | 0 |
Test 8 [Random number sequence] | 3/3 (100%) | 0 |
Test 9 [Modulo 20, ones & zeros] | 3/3 (100%) | 0 |
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 3/3 (100%) | 0 |
Test 11 [Random number sequence, 64-bit] | 3/3 (100%) | 0 |
Test 12 [Random number sequence, 128-bit] | 3/3 (100%) | 0 |
Test 13 [Hammer test] | 3/3 (100%) | 0 |
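As a sanity check on the "Memory Range Tested" line, the sketch below converts the reported end address to MB and compares it with the nominal installed capacity. The nominal figure rests on an assumption that the truncated report does not confirm: that all 16 DIMMs are 32GB like the two shown. The helper names are hypothetical.

```python
# Values copied from the result summary above.
RANGE_END = 0x7080000000   # upper bound of "Memory Range Tested"
REPORTED_MB = 460800       # size MemTest86 printed next to the range

# ASSUMPTION (not confirmed by the truncated report): all 16 DIMMs are 32GB,
# like the two SPD entries shown.
DIMM_COUNT = 16
DIMM_SIZE_GB = 32

BYTES_PER_MB = 1024 * 1024


def range_end_mb(end_address):
    """Convert an end address in bytes to whole MB."""
    return end_address // BYTES_PER_MB


def nominal_capacity_mb(count, size_gb):
    """Nominal capacity in MB for `count` DIMMs of `size_gb` each."""
    return count * size_gb * 1024


tested_mb = range_end_mb(RANGE_END)
nominal_mb = nominal_capacity_mb(DIMM_COUNT, DIMM_SIZE_GB)
shortfall_mb = nominal_mb - tested_mb

print(tested_mb, nominal_mb, shortfall_mb)
```

This prints `460800 524288 63488`: the hex range matches the 460800MB that MemTest86 reports, and it ends about 62GB below the assumed 512GB nominal capacity. That gap is close to the size of two 32GB DIMMs, so it may be worth checking whether the firmware has excluded some memory from the map that MemTest86 sees. It is only a hint, because an address range can contain holes and the per-DIMM sizes are assumed.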