List of Motherboards with issues when running MemTest86 in multi-CPU selection modes
-
Hello,
MemTest86 kept getting stuck at "Testing multiprocessor support". My system consists of:
WS-C621E-SAGE
2 x Intel Xeon Gold 6130
I updated BIOS to version 6801 (most recent one at the time of writing) and the problem didn't go away.
After I added "WS-C621E-SAGE Series" to the blacklist to disable multithreading, the test worked fine.
Comment
-
Originally posted by csehydrogen:
Hello,
MemTest86 kept getting stuck at "Testing multiprocessor support". My system consists of:
WS-C621E-SAGE
2 x Intel Xeon Gold 6130
I updated BIOS to version 6801 (most recent one at the time of writing) and the problem didn't go away.
After I added "WS-C621E-SAGE Series" to the blacklist to disable multithreading, the test worked fine.
Can you give this build a try:
https://www.passmark.com/temp/memtes...-10.1.0010.zip
Comment
-
Originally posted by theregoesplanb:
Uploaded the log from the (manually aborted) run here: https://file.io/CMDuiCIMcCbv
Code:
"H12DSi-NT6",ALL,EXACT,RESTRICT_MP
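For readers who want to sanity-check an entry like the one above before adding it to the blacklist, here is a minimal Python sketch that splits such a line into its four comma-separated fields. The field names (`board`, `bios`, `match`, `flag`) are my own reading of the entry shown in this thread, not names taken from MemTest86 documentation, and the four-field check is an assumption based on that one example.

```python
import csv
from dataclasses import dataclass


@dataclass
class BlacklistEntry:
    # Field names are assumptions inferred from the entry quoted in this thread.
    board: str  # quoted board name, e.g. "H12DSi-NT6"
    bios: str   # second field, "ALL" in the quoted entry
    match: str  # third field, "EXACT" in the quoted entry
    flag: str   # fourth field, "RESTRICT_MP" in the quoted entry


def parse_entry(line: str) -> BlacklistEntry:
    """Split one quoted, comma-separated entry into four fields."""
    # csv.reader handles the double-quoted first field for us.
    fields = next(csv.reader([line.strip()]))
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    board, bios, match, flag = (f.strip() for f in fields)
    return BlacklistEntry(board, bios, match, flag)


entry = parse_entry('"H12DSi-NT6",ALL,EXACT,RESTRICT_MP')
print(entry)
```

A line with a missing or extra field raises `ValueError`, which catches simple typos such as a dropped comma.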
Comment
-
Here's a new link to the log that won't disappear in a few days: https://www.dropbox.com/scl/fi/egzmn...paecegy4f9igat
One concerning thing is that I did get ECC errors in MP mode, but I haven't been able to replicate them when running in uniprocessor mode. Is there any chance that those were false positives due to MP incompatibility?
You'll probably want to add H12DSi-N6 to the blacklist as well. It's basically the same board with a different onboard NIC.
Comment
-
Recently I ran MemTest86 V10.2 Pro on a system with a Xeon E5-2670, an X9SRL-F (listed in the first post) and 80 GB of RAM. After running for about 50 hours, it was only halfway through pass 3. Initially I thought something was wrong, as I've never had MemTest86 take this long to get through the passes, no matter the amount of RAM.
The error count was 0, despite this note:
Note: Your RAM may be vulnerable to high frequency row hammer bit flips. However the conditions needed to induce these errors occur only very rarely in normal PC usage, and so this should not be of concern to most users.
After checking the config, I found that the testing was done on 1 CPU core, which MemTest86 set because this motherboard's BIOS apparently has this bug. After I set the CPU selection mode to parallel instead of single, MemTest86 ran for barely a minute or two before the system reset (which I can reproduce). I guess that's the bug?
Since MemTest86 was running on only 1 core, could that explain the long runtime?
Comment
-
Motherboard: Super Micro H11DSi-NT
CPU: Dual AMD EPYC 7F52
BIOS Firmware Version: 2.5
No freezing or reboot, the software keeps showing the "[UEFI firmware Error] Could not start CPU X" message.
Super Micro recently released a new BIOS version, 2.7, so the issue may be fixed?
Comment
-
Hello,
Sorry, admin, for the previous post; I thought I could edit it after posting.
My motherboard BIOS firmware is now updated to the newest release, version 2.7, but the problem is still the same.
My test environment can be summed up as follows:
Motherboard: Super Micro H11DSi-NT
CPU: Dual AMD EPYC 7F52
BIOS Firmware Version: 2.7
RAM: 16 x Hynix HMA82GR7CJR8N-XN DDR4 16 GB 3200 MHz RDIMM
Memtest86 Version: V10.6 Free
SMT is disabled in the BIOS
A retired 2U Super Micro Super Chassis with a 700 W hot-swappable power supply
I manually terminated the test after it finished pass #2, and the log file is attached.
As you may see in the attached log file, my current build can run the 1st pass without any error or warning messages (including the annoying "[UEFI firmware Error]" one). However, the "[UEFI firmware Error] Could not start CPU X" message shows up after the 1st pass is finished. Before the test was aborted, the system showed no memory errors, but I got loads of "[UEFI firmware Error] Could not start CPU X" messages during the test.
In fact, when I ran the test with BIOS version 2.5, the system could complete a 4-pass test with 0 memory errors, and again I got many "[UEFI firmware Error] Could not start CPU X" messages at the end of the test, as you may see in my previous post.
The IPMI's Health Event Log shows no failure events. (I cleared it before the test was conducted, because I had gotten loads of fan failure messages when I pulled out the high-speed 8 cm fans at night.)
I also ran the AIDA64 test (CPUs, cache & memory) for 3 hours to test the stability of my build. After I stopped the test, there were no reported errors from Windows or the IPMI's Health Event Log.
Am I likely experiencing the same BIOS bug that the other users mentioned in this thread?
Sorry for my broken English, please let me know if there is any other info that I may be able to provide.
Comment
-
Originally posted by mt04340434:
Hello,
Motherboard: Super Micro H11DSi-NT
CPU: Dual AMD EPYC 7F52
BIOS Firmware Version: 2.7
RAM: 16 x Hynix HMA82GR7CJR8N-XN DDR4 16 GB 3200 MHz RDIMM
Memtest86 Version: V10.6 Free
SMT is disabled in the BIOS
A retired 2U Super Micro Super Chassis with a 700 W hot-swappable power supply
Can you first try disabling ECC polling from the main menu and running the tests again?
Also, we had a report about UEFI firmware issues on a similar chipset but different motherboard on v10.6. In this case, it seems the errors don't appear when running an earlier version of MemTest86 (v8.4).
Can you run MemTest86 v8.4 as well and upload a copy of the logs:
https://www.memtest86.com/downloads/...86-8.4-usb.zip
Comment
-
Originally posted by keith:
Thanks for the logs.
Can you first try disabling ECC polling from the main menu and running the tests again?
Also, we had a report about UEFI firmware issues on a similar chipset but different motherboard on v10.6. In this case, it seems the errors don't appear when running an earlier version of MemTest86 (v8.4).
Can you run MemTest86 v8.4 as well and upload a copy of the logs:
https://www.memtest86.com/downloads/...86-8.4-usb.zip
I ran the tests with ECC polling disabled on both v10.6 and v8.4 of MemTest86 (1 pass only); the v10.6 run was manually terminated.
I also ran MemTest86 v8.4 for 4 passes with default settings, but the log is too large (2.5 MB) to upload to the forum.
With ECC polling disabled on MemTest86 v10.6, the error message now appears in the 1st pass, but my PC is not crashing, freezing, or rebooting at all.
The two runs with v8.4 showed no error messages at all.
I hope this helps.
Comment
-
Thanks for the logs.
The logs confirm it is likely a UEFI BIOS issue (and not related to ECC polling or a bug introduced in later versions of MemTest86).
Even though the errors don't show up on screen in v8.4, they are present in the log file. The [UEFI Firmware Error] messages on the screen were introduced in a later version of MemTest86.
Although this is a software bug that should be fixed in the BIOS by the vendor, it likely isn't the critical (hardware) error that MemTest86 makes it out to be. We'll need to revisit how to report such errors so that the report better represents their severity.
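For anyone who wants to check their own log for these entries, a minimal Python sketch follows. It assumes the log is plain text and that its entries use the same "Could not start CPU" wording as the on-screen message quoted in this thread; the actual wording in a given log may differ, and the sample lines below are made up for illustration rather than copied from a real log.

```python
import re
from collections import Counter

# Assumed pattern: mirrors the on-screen message quoted in this thread.
CPU_START_ERROR = re.compile(r"Could not start CPU (\d+)", re.IGNORECASE)


def count_cpu_start_errors(log_text: str) -> Counter:
    """Count how many start failures are reported for each CPU number."""
    return Counter(int(m.group(1)) for m in CPU_START_ERROR.finditer(log_text))


# Hypothetical sample lines, not taken from a real MemTest86 log.
sample = """\
[UEFI firmware Error] Could not start CPU 5
Some other log line
[UEFI firmware Error] Could not start CPU 5
[UEFI firmware Error] Could not start CPU 12
"""
counts = count_cpu_start_errors(sample)
print(counts)
```

To scan a real file, read it with `open(path, errors="replace").read()` and pass the text to `count_cpu_start_errors`; an empty result only means the assumed wording was not found.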
Comment