Hello,
My recently upgraded system started exhibiting behavior I've recognized in the past as RAM errors: programs closing or hanging unexpectedly and at random, as well as strange OS behavior. I booted to MemTest and left it running.
Sure enough, when I came back later in the day I found that MemTest had found errors. A lot of them. However, the results don't quite make sense to me; I'm used to the original MemTest86+ version, which would list the details of the errors as they occurred. This version of MemTest says there were 301 total errors and that only 54% of the tests passed, but showed only a single error's details. It wasn't even the last error, because more happened during the final hammer test.
In passes 1 and 2, MemTest warned that the RAM "may be vulnerable to high frequency row hammer bit flips" but did not do so for passes 3 and 4. That suggested to me that it's not just a hammer test vulnerability, but that something really is wrong with the sticks.
The system is currently using four Corsair 16GB DDR4 sticks, all recently bought. However, when I checked the detailed SPD information, the stick in DIMM Slot #1 was not reporting correctly: it reported as 16GB DDR4 PC4-17000 and 15-15-15-36 / 2134 MHz / 1.2V, while the other three reported as 16GB DDR4 XMP PC4-25600 and 16-18-18-36 / 3200 MHz / 1.350V. The misreporting stick also did not identify itself as a Corsair product, or ... anything, really. When I rebooted to double-check the UEFI, all four sticks reported correctly, and when I booted back into MemTest the misreporting stick started reporting correctly as well.
So ... what's going on? Could the misreporting stick be the one having RAM issues? Why did MemTest report the details of only one error when it had detected 301?
Here's the log file in its entirety.
Summary
Report Date | 2019-09-28 20:55:53 |
Generated by | MemTest86 V8.2 Free (64-bit) |
Result | FAIL |
System Information
EFI Specifications | 2.40 |
System | |
Manufacturer | ASUS |
Product Name | All Series |
Version | System Version |
Serial Number | System Serial Number |
BIOS | |
Vendor | American Megatrends Inc. |
Version | 3902 |
Release Date | 04/19/2018 |
Baseboard | |
Manufacturer | ASUSTeK COMPUTER INC. |
Product Name | SABERTOOTH X99 |
Version | Rev 1.xx |
Serial Number | 170294965100026 |
CPU Type | Intel Core i7-6950X @ 3.00GHz |
CPU Clock | 2998 MHz [Turbo: 3398.0 MHz] |
# Logical Processors | 20 (10 enabled for testing) |
L1 Cache | 8 x 64K (148781 MB/s) |
L2 Cache | 8 x 256K (47185 MB/s) |
L3 Cache | 25600K (29024 MB/s) |
Memory | 65456M (13509 MB/s) |
DIMM Slot #0 | 16GB DDR4 XMP PC4-25600 |
Corsair / CMR32GX4M2C3200C16 | |
16-18-18-36 / 3200 MHz / 1.350V | |
DIMM Slot #1 | 16GB DDR4 PC4-17000 |
15-15-15-36 / 2134 MHz / 1.2V | |
DIMM Slot #2 | 16GB DDR4 XMP PC4-25600 |
Corsair / CMR32GX4M2C3200C16 | |
16-18-18-36 / 3200 MHz / 1.350V | |
DIMM Slot #3 | 16GB DDR4 XMP PC4-25600 |
Corsair / CMR32GX4M2C3200C16 | |
16-18-18-36 / 3200 MHz / 1.350V |
Result summary
Test Start Time | 2019-09-28 09:29:53 |
Elapsed Time | 11:15:30 |
Memory Range Tested | 0x0 - 1040000000 (66560MB) |
CPU Selection Mode | Parallel (All CPUs) |
ECC Polling | Enabled |
# Tests Passed | 26/48 (54%) |
Lowest Error Address | 0xB4F94BCB8 (46329MB) |
Highest Error Address | 0xB4F94BCB8 (46329MB) |
Bits in Error Mask | 0000000004000000 |
Bits in Error | 1 |
Max Contiguous Errors | 1 |
Test | # Tests Passed | Errors |
Test 0 [Address test, walking ones, 1 CPU] | 4/4 (100%) | 0 |
Test 1 [Address test, own address, 1 CPU] | 4/4 (100%) | 0 |
Test 2 [Address test, own address] | 4/4 (100%) | 0 |
Test 3 [Moving inversions, ones & zeroes] | 3/4 (75%) | 6 |
Test 4 [Moving inversions, 8-bit pattern] | 2/4 (50%) | 19 |
Test 5 [Moving inversions, random pattern] | 1/4 (25%) | 27 |
Test 6 [Block move, 64-byte blocks] | 4/4 (100%) | 0 |
Test 7 [Moving inversions, 32-bit pattern] | 0/4 (0%) | 112 |
Test 8 [Random number sequence] | 0/4 (0%) | 21 |
Test 9 [Modulo 20, ones & zeros] | 4/4 (100%) | 0 |
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 0/4 (0%) | 4 |
Test 13 [Hammer test] | 0/4 (0%) | 112 |
Last 10 Errors |
2019-09-28 09:32:36 - [Data Error] Test: 4, CPU: 0, Address: B4F94BCB8, Expected: 80808080, Actual: 84808080 |
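For anyone who wants to check the numbers in the log by hand, here is a minimal Python sketch. It decodes the single logged error (XOR of expected against actual, then the address in MB) and re-adds the per-test columns. The values are copied from the report above. The helper names are made up for illustration and are not part of MemTest86, and the address-to-MB conversion is an assumption about how the report rounds.

```python
# Values copied from the logged error line; helper names are illustrative only.
EXPECTED = 0x80808080
ACTUAL = 0x84808080
ADDRESS = 0xB4F94BCB8


def flipped_bits(expected: int, actual: int) -> list[int]:
    """Return positions (0 = least significant) of the bits that differ."""
    mask = expected ^ actual  # XOR leaves a 1 wherever the two words disagree
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


def address_to_mb(address: int) -> int:
    """Convert a byte address to whole MB (assumes 1 MB = 2**20 bytes, rounded down)."""
    return address >> 20


# test number -> (runs passed out of 4, errors), copied from the result table
RESULTS = {
    0: (4, 0), 1: (4, 0), 2: (4, 0), 3: (3, 6), 4: (2, 19), 5: (1, 27),
    6: (4, 0), 7: (0, 112), 8: (0, 21), 9: (4, 0), 10: (0, 4), 13: (0, 112),
}

if __name__ == "__main__":
    print(f"error mask:    {EXPECTED ^ ACTUAL:016X}")
    print(f"flipped bits:  {flipped_bits(EXPECTED, ACTUAL)}")
    print(f"address in MB: {address_to_mb(ADDRESS)}")
    passed = sum(p for p, _ in RESULTS.values())
    errors = sum(e for _, e in RESULTS.values())
    print(f"tests passed:  {passed}/{4 * len(RESULTS)}")
    print(f"total errors:  {errors}")
```

The mask comes out as 0000000004000000 (a single flipped bit, bit 26), the address as 46329 MB, and the columns add up to 26/48 passed and 301 errors. So the summary fields agree with the one logged error and with the per-test table; this does not explain why the other 300 errors have no detail lines.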