3 month old PC. Hardware is as follows:
Asus Maximus IX Formula
Intel Core i7-7700K @ 4.20 GHz
4x G.Skill 16GB DDR4-3600 F4-3600C17-16GTZKW
2x Asus GeForce GTX 1080 Ti 11GB GDDR5X Founders Edition
Primary Drive: Samsung 850 Pro 2.5" 1TB SATA III 3D Internal SSD
Secondary Drive: WD Black 6TB HDD
Corsair Hydro Series H100i V2 CPU cooler
1x NZXT AC-IUSBH-M1 USB Hub
Thermaltake RGB 1250W 80 PLUS TITANIUM Power Supply
3.5" Rosewill 40:1 internal card reader
LB 16x Blu-Ray rewriter
Cooler Master HAF 932 Full Tower
Windows 10 Pro 64-bit
Over the last 2-3 weeks I have been getting random memory-related BSODs. I've reinstalled the OS twice, removed all peripherals, and replaced the RAM with an entirely different set. Every configuration produces the same type of BSOD, within seconds, minutes, or hours of booting up. The more actively I load programs after boot, the sooner it crashes. If I try to log in as soon as the login screen appears, it crashes; if I wait 60-120 seconds, it doesn't. Once at the desktop, if I load any program within the first 60-120 seconds, it crashes. If I wait, it generally doesn't, until I load something heavy such as Battlefield 1, BF4, or WoW. Sometimes it then runs for a few hours, sometimes not.
After reinstalling the OS from scratch to rule out a driver problem, I figured it must be hardware. I'd seen sporadic reports that my SteelSeries Z board could cause problems with my video cards, so I swapped it for a plain old USB keyboard. No luck. No matter what configuration I tried, the failures persisted.
Moved on to testing the primary components, figuring those were the only remaining possibility I hadn't changed. Went out and bought new RAM, 4 sticks of Corsair DDR4-3000 RGB Vengeance. Problem persisted.
Downloaded MemTest86 from this site and began testing the original RAM. The first test was all four sticks in all four slots.
Test Start Time | 2017-09-24 15:05:01
Elapsed Time | 10:57:32
Memory Range Tested | 0x0 - 107F000000 (67568MB)
CPU Selection Mode | Parallel (All CPUs)
ECC Polling | Enabled
# Tests Passed | 32/48 (66%)
Lowest Error Address | 0x1971A0 (1MB)
Highest Error Address | 0x10700F51BC (67328MB)
Bits in Error Mask | 00000000FFFFFFFF
Bits in Error | 32
Max Contiguous Errors | 2

Test | # Tests Passed | Errors
Test 0 [Address test, walking ones, 1 CPU] | 4/4 (100%) | 0
Test 1 [Address test, own address, 1 CPU] | 4/4 (100%) | 0
Test 2 [Address test, own address] | 4/4 (100%) | 0
Test 3 [Moving inversions, ones & zeroes] | 2/4 (50%) | 32
Test 4 [Moving inversions, 8-bit pattern] | 2/4 (50%) | 64
Test 5 [Moving inversions, random pattern] | 0/4 (0%) | 96
Test 6 [Block move, 64-byte blocks] | 4/4 (100%) | 0
Test 7 [Moving inversions, 32-bit pattern] | 0/4 (0%) | 17176
Test 8 [Random number sequence] | 4/4 (100%) | 0
Test 9 [Modulo 20, ones & zeros] | 0/4 (0%) | 177
Test 10 [Bit fade test, 2 patterns, 1 CPU] | 4/4 (100%) | 0
Test 13 [Hammer test] | 4/4 (100%) | 0

Last 10 Errors
[Data Error] Test: 9, CPU: 2, Address: 1051902A2C, Expected: 7A62D708, Actual: 859D28F7
[Data Error] Test: 9, CPU: 2, Address: F8338DD30, Expected: 67722560, Actual: 988DDA9F
[Data Error] Test: 9, CPU: 2, Address: F70EA9B10, Expected: 2EF0E0D0, Actual: D10F1F2F
[Data Error] Test: 9, CPU: 2, Address: ED08D2C88, Expected: CC2F6370, Actual: 33D09C8F
[Data Error] Test: 9, CPU: 2, Address: EC111ED34, Expected: 6D56AF43, Actual: 92A950BC
[Data Error] Test: 9, CPU: 2, Address: EC0077D00, Expected: 6D56AF43, Actual: 92A950BC
[Data Error] Test: 9, CPU: 2, Address: E43B84CB8, Expected: 2909AA9A, Actual: D6F65565
[Data Error] Test: 9, CPU: 2, Address: D80CA09A4, Expected: B61544D6, Actual: 49EABB29
[Data Error] Test: 9, CPU: 2, Address: D40B9ED98, Expected: 228E5A95, Actual: DD71A56A
[Data Error] Test: 9, CPU: 2, Address: D31C2BEB0, Expected: 256EBB70, Actual: DA91448F
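One thing that may be worth checking in the error list is how Expected and Actual differ in each entry. Here is a minimal Python sketch that XORs each pair from the "Last 10 Errors" list. The helper name `flipped_bits` and the variable names are my own and have nothing to do with MemTest86 itself; the values are copied from the report above.

```python
# (expected, actual) pairs copied from the "Last 10 Errors" list.
errors = [
    (0x7A62D708, 0x859D28F7),
    (0x67722560, 0x988DDA9F),
    (0x2EF0E0D0, 0xD10F1F2F),
    (0xCC2F6370, 0x33D09C8F),
    (0x6D56AF43, 0x92A950BC),
    (0x6D56AF43, 0x92A950BC),
    (0x2909AA9A, 0xD6F65565),
    (0xB61544D6, 0x49EABB29),
    (0x228E5A95, 0xDD71A56A),
    (0x256EBB70, 0xDA91448F),
]

MASK32 = 0xFFFFFFFF  # the reported values are 32-bit words


def flipped_bits(expected: int, actual: int) -> int:
    """XOR of the two words: a 1 marks every bit position that differs."""
    return (expected ^ actual) & MASK32


masks = [flipped_bits(e, a) for e, a in errors]

# True only if every listed word came back as the bitwise complement
# of what the test expected.
all_inverted = all(m == MASK32 for m in masks)

for (e, a), m in zip(errors, masks):
    print(f"expected={e:08X} actual={a:08X} xor={m:08X} bits={bin(m).count('1')}")
print("every word fully inverted:", all_inverted)
```

For these ten entries every XOR comes out as FFFFFFFF, so Actual is the exact bitwise complement of Expected each time, which agrees with the "Bits in Error Mask" and "Bits in Error" lines of the report. Whole words coming back inverted does not look like one or two stuck bits, but that is only my reading of the numbers, not something the report itself states.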
The results were not pleasing. To best isolate the problem, I decided to test one stick at a time in each slot in turn, for a total of 16 tests. I won't list the complete results here unless asked, as that is a lot of data. However, here are the results for the first three sticks, with the failed MemTest86 test numbers and the total error count in parentheses:
Test 1: All 4 sticks all 4 slots - failed (#3,4,5,7,9 - 17545)
Test 2: Stick 1 Slot 1 - passed
Test 3: Stick 1 Slot 2 - failed (#7-16)
Test 4: Stick 1 Slot 3 - failed (#4,7-66)
Test 5: Stick 1 Slot 4 - failed (#9-1)
Test 6: Stick 2 Slot 1 - failed (#7,9-67)
Test 7: Stick 2 Slot 2 - passed
Test 8: Stick 2 Slot 3 - failed (#7-16)
Test 9: Stick 2 Slot 4 - passed
Test 10: Stick 3 Slot 1 - failed (#4,7,9-9
Test 11: Stick 3 Slot 2 - failed (#5,7-32)
Test 12: Stick 3 Slot 3 - failed (#7-16)
Test 13: Stick 3 Slot 4 - failed (#7-16)
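To see whether the failures follow a particular stick or a particular slot, the single-stick runs above can be tallied both ways. This is a rough Python sketch using only the pass/fail outcomes from Tests 2-13; the `tally` helper and the data layout are my own. It also cross-checks the error total quoted for Test 1 against the per-test error counts in the four-stick report.

```python
from collections import defaultdict

# (stick, slot, passed) for the single-stick runs, Tests 2-13.
runs = [
    (1, 1, True), (1, 2, False), (1, 3, False), (1, 4, False),
    (2, 1, False), (2, 2, True), (2, 3, False), (2, 4, True),
    (3, 1, False), (3, 2, False), (3, 3, False), (3, 4, False),
]


def tally(runs, key_index):
    """Count (passes, failures) grouped by stick (index 0) or slot (index 1)."""
    counts = defaultdict(lambda: [0, 0])
    for run in runs:
        passed = run[2]
        counts[run[key_index]][0 if passed else 1] += 1
    return {key: tuple(value) for key, value in sorted(counts.items())}


by_stick = tally(runs, 0)
by_slot = tally(runs, 1)

# Error counts for the failing tests (3, 4, 5, 7, 9) in the four-stick report.
four_stick_errors = {3: 32, 4: 64, 5: 96, 7: 17176, 9: 177}
total_errors = sum(four_stick_errors.values())

print("by stick (passes, failures):", by_stick)
print("by slot  (passes, failures):", by_slot)
print("four-stick error total:", total_errors)
```

Going by these numbers, no stick and no slot is clean across the board: stick 3 fails in every slot and slot 3 fails with every stick, yet sticks 1 and 2 each pass somewhere. So the failures do not obviously follow a single module or a single slot. The four-stick total works out to 17545, matching the figure quoted for Test 1.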
At this point I stopped, because around test 7 or 8 I started to notice that the reporting CPU was always #2. I went back through all the previous test logs, which I had been saving as HTML files, and found that indeed, every test that failed reported CPU: 2.
I see that the tests recognize I have 8 cores (presumably the 4 physical cores plus Hyper-Threading), but only 4 are enabled for testing. I'm testing in parallel mode. Does this indicate that my 2nd core is faulty? Or does it just happen to be the only core that reports the failures?
I don't have enough experience with this program to know for sure, so if anyone here does, I'd appreciate any thoughts.
My next course of action is to reinstall all 4 sticks of RAM, disable core #2 for testing, and see if the errors persist. If they do, I suppose I'll have to single-core test all 4 sticks simultaneously, then maybe even individually, to try to isolate the problem. I'm really hoping it doesn't come to that. Even using 4 cores, a single-stick test battery takes 4 hours. All 4 cores with all 4 sticks took 11 hours. I've already been testing for 59 hours. I'm really ready to know exactly what the problem is so I can move out of the testing phase and get into the repair phase.
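For anyone checking the time budget, the hours add up from the figures in this post. A small Python sketch, assuming every run takes its nominal duration (11 hours for the four-stick run, 4 hours per single-stick run); the constant names are my own.

```python
FULL_RUN_HOURS = 11        # all four sticks, four cores (figure from the post)
SINGLE_STICK_HOURS = 4     # one stick, four cores (figure from the post)
SINGLE_RUNS_DONE = 12      # Tests 2-13: sticks 1-3 in each of the four slots
SINGLE_RUNS_PLANNED = 16   # 4 sticks x 4 slots

# Time already spent: one full run plus the completed single-stick runs.
hours_so_far = FULL_RUN_HOURS + SINGLE_RUNS_DONE * SINGLE_STICK_HOURS

# Time still needed to finish the stick/slot matrix (stick 4 only).
hours_left_in_matrix = (SINGLE_RUNS_PLANNED - SINGLE_RUNS_DONE) * SINGLE_STICK_HOURS

print("hours so far:", hours_so_far)
print("hours left to finish the matrix:", hours_left_in_matrix)
```

That reproduces the 59 hours already spent and puts finishing the fourth stick at roughly 16 more hours. It says nothing about how long a run with core #2 disabled or a single-core run would take, since I have no timing for those.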
Thanks for any help you guys can provide.