Hello,
In October I built a new R9 7900X-based system with an Asus TUF X670E PLUS WIFI motherboard, and it feels like I have been fighting RAM-related issues since day 1.
I was using some G.Skill Flare X5 5600 RAM, which is on the motherboard's QVL. The first sign of trouble was that the memory training stage of POST would occasionally fail. The motherboard would reboot itself and lock the RAM at the default 4800 JEDEC speed until I manually turned EXPO off and on again. Every time this happened I ran MemTest86, and it never detected any errors. Also, with the EXPO profile enabled, the BIOS would freeze for a few seconds upon entry before becoming responsive. I initially chalked this up to a new platform, an early BIOS, and early-adopter issues.
At some point I turned on Memory Context Restore to try to speed up POST by letting the motherboard skip memory training when possible. A few days after I did this, I encountered my first total OS corruption: a boot loop of memory-related BSODs, followed by various service failures, etc. I ran MemTest86 immediately afterwards, and it still detected no errors.
I figured the EXPO profile was probably not stable, so I decided to run the sticks at JEDEC speed until further BIOS improvements arrived, and turned Memory Context Restore back to normal so the motherboard could properly train the RAM as needed. This was fine for several weeks and seemed very stable, until yesterday.
Out of the blue, the machine had a BSOD related to Memory Management and entered a BSOD loop very similar to the first one. Once again my Windows installation was not recoverable. Given that this happened at the stock JEDEC 4800 MHz, I immediately ran MemTest86 again, and this time it showed 10k+ errors 3 minutes into the test.
They were all 1-bit errors during Test 6, spread over a relatively large address range, with every CPU thread detecting errors.
Figuring one or both of the sticks had completely failed, I tried to isolate which stick was still good so I could reinstall the OS and have a functional PC. To my surprise, both sticks passed individually. Then I plugged both sticks back into their original slots and, to my surprise again, both passed all 4 passes this time! I have run additional tests since then but could not reproduce the error. I have had several RAM failures in the 15 years or so I've been building PCs, and they were all reproducible with relative consistency.
But two OS corruptions later, I can no longer trust the machine. So I got myself another set of DDR5 RAM today, same brand, but rated at 6000 MHz this time, and also on the motherboard's QVL.
I tested the new sticks at their EXPO speed immediately after installing them, and they passed all 4 passes with no issues. But I wanted to be extra sure of stability, so I turned off the EXPO profile and ran them at JEDEC speed as I prepared for my OS reinstall and data restore. Before starting that, I kicked off another MemTest86 run at JEDEC speed just to be sure they were stable…annnnd…it detected a single error during Test 8. The error did not recur in the subsequent passes, and the entire run ended with 1 error detected.
This is another situation I have never run across; in my past experience, RAM either throws a large number of errors or none at all. Plus, the same sticks literally just passed at their higher EXPO speed while overclocked, but then had an error when running at stock? Given that I just swapped the RAM sticks, could this really be faulty RAM, or could something else be at play here? Is this just a fluke? Or is my luck really this bad? I really don't know anymore after all this…
MemTest86 report for the original 5600 kit at JEDEC 4800 MHz (the run with 10k+ errors in Test 6):
Lowest Error Address | 0x121C848 (18MB) |
Highest Error Address | 0x27FBA8A68 (10235MB) |
Bits in Error Mask | 0000000010000000 |
Bits in Error | 1 |
Max Contiguous Errors | 1 |
CPUs that detected memory errors | { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 } |
Test | # Tests Passed | Errors |
Test 0 [Address test, walking ones, 1 CPU] | 1/1 (100%) | 0 |
Test 1 [Address test, own address, 1 CPU] | 1/1 (100%) | 0 |
Test 2 [Address test, own address] | 1/1 (100%) | 0 |
Test 3 [Moving inversions, ones & zeroes] | 1/1 (100%) | 0 |
Test 4 [Moving inversions, 8-bit pattern] | 1/1 (100%) | 0 |
Test 5 [Moving inversions, random pattern] | 1/1 (100%) | 0 |
Test 6 [Block move, 64-byte blocks] | 0/0 (0%) | 10401 |
Test 7 [Moving inversions, 32-bit pattern] | 0/0 (0%) | 0 |
Test 8 [Random number sequence] | 0/0 (0%) | 0 |
Test 9 [Modulo 20, ones & zeros] | 0/0 (0%) | 0 |
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 0/0 (0%) | 0 |
Test 13 [Hammer test] | 0/0 (0%) | 0 |
Last 10 Errors |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 5, Address: 251127048, Expected: FFFEFFFF, Actual: EFFEFFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 4, Address: 24CC1AE48, Expected: FFFBFFFF, Actual: EFFBFFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 6, Address: 255329148, Expected: FFFF7FFF, Actual: EFFF7FFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 7, Address: 258B35008, Expected: FF7FFFFF, Actual: EF7FFFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 10, Address: 264765488, Expected: FFFFDFFF, Actual: EFFFDFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 3, Address: 248A16088, Expected: FFFFFFDF, Actual: EFFFFFDF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 1, Address: 24157EE08, Expected: F7FFFFFF, Actual: E7FFFFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 10, Address: 2646E5288, Expected: BFFFFFFF, Actual: AFFFFFFF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 2, Address: 24453D888, Expected: FFFFF7FF, Actual: EFFFF7FF |
2022-12-02 16:48:00 - [Data Error] Test: 6, CPU: 3, Address: 24895EA08, Expected: FFFF7FFF, Actual: EFFF7FFF |
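As a quick cross-check of the "all 1-bit errors" observation, the expected and actual values in the ten logged errors can be XORed pairwise to see which bit differs. This is only a minimal Python sketch; `flipped_bits` and `summarize` are illustrative helper names, not anything MemTest86 provides, and the values are simply copied from the list above.

```python
# Expected/actual pairs copied from the "Last 10 Errors" list of the first run.
ERRORS = [
    ("FFFEFFFF", "EFFEFFFF"),
    ("FFFBFFFF", "EFFBFFFF"),
    ("FFFF7FFF", "EFFF7FFF"),
    ("FF7FFFFF", "EF7FFFFF"),
    ("FFFFDFFF", "EFFFDFFF"),
    ("FFFFFFDF", "EFFFFFDF"),
    ("F7FFFFFF", "E7FFFFFF"),
    ("BFFFFFFF", "AFFFFFFF"),
    ("FFFFF7FF", "EFFFF7FF"),
    ("FFFF7FFF", "EFFF7FFF"),
]


def flipped_bits(expected_hex, actual_hex):
    """Return the bit positions (0 = least significant) where two 32-bit values differ."""
    diff = int(expected_hex, 16) ^ int(actual_hex, 16)
    return [bit for bit in range(32) if (diff >> bit) & 1]


def summarize(errors):
    """Collect every bit position that differs in at least one logged error."""
    positions = set()
    for expected, actual in errors:
        positions.update(flipped_bits(expected, actual))
    return positions


if __name__ == "__main__":
    for expected, actual in ERRORS:
        print(expected, actual, flipped_bits(expected, actual))
    print("distinct flipped bits:", sorted(summarize(ERRORS)))
```

For these ten entries every pair differs only in bit 28, which read back as 0 where 1 was expected. That agrees with the error mask 0000000010000000 in the report, although the log only shows the last 10 of the 10401 errors.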
MemTest86 report for the new 6000 kit at JEDEC speed (the run with a single error in Test 8):
Lowest Error Address | 0x47E698DC8 (18406MB) |
Highest Error Address | 0x47E698DC8 (18406MB) |
Bits in Error Mask | 0000000000000010 |
Bits in Error | 1 |
Max Contiguous Errors | 1 |
CPUs that detected memory errors | { 0 } |
Test | # Tests Passed | Errors |
Test 0 [Address test, walking ones, 1 CPU] | 4/4 (100%) | 0 |
Test 1 [Address test, own address, 1 CPU] | 4/4 (100%) | 0 |
Test 2 [Address test, own address] | 4/4 (100%) | 0 |
Test 3 [Moving inversions, ones & zeroes] | 4/4 (100%) | 0 |
Test 4 [Moving inversions, 8-bit pattern] | 4/4 (100%) | 0 |
Test 5 [Moving inversions, random pattern] | 4/4 (100%) | 0 |
Test 6 [Block move, 64-byte blocks] | 4/4 (100%) | 0 |
Test 7 [Moving inversions, 32-bit pattern] | 4/4 (100%) | 0 |
Test 8 [Random number sequence] | 3/4 (75%) | 1 |
Test 9 [Modulo 20, ones & zeros] | 4/4 (100%) | 0 |
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 4/4 (100%) | 0 |
Test 13 [Hammer test] | 4/4 (100%) | 0 |
Last 10 Errors |
2022-12-03 19:07:18 - [Data Error] Test: 8, CPU: 0, Address: 47E698DC8, Expected: BC64DD7C, Actual: BC64DD6C |
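To compare where the errors landed in the two runs, the hex addresses from both reports can be converted to the MB figures MemTest86 prints next to them. The sketch below assumes "MB" here means 2^20 bytes rounded down, which is my assumption and not something the report states; `addr_to_mib` is a hypothetical helper name.

```python
MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes

# Hex addresses and the MB values shown beside them in the two reports.
REPORTED = {
    "0x121C848": 18,       # first run, lowest error address
    "0x27FBA8A68": 10235,  # first run, highest error address
    "0x47E698DC8": 18406,  # second run, the single error
}


def addr_to_mib(addr_hex):
    """Convert a hex physical address to whole MiB, rounding down."""
    return int(addr_hex, 16) // MIB


if __name__ == "__main__":
    for addr, reported_mb in REPORTED.items():
        computed = addr_to_mib(addr)
        status = "matches" if computed == reported_mb else "differs from"
        print(f"{addr}: {computed} MiB ({status} reported {reported_mb}MB)")
```

Under that assumption all three values line up with the reports: the first kit's errors span roughly 18 MB to 10235 MB, while the new kit's single error sits at one address around 18406 MB.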