List of Motherboards with issues when running MemTest86 in multi-CPU selection modes

This is a sticky topic.

  • memetic
    replied
    • MB Make: ASRockRack
    • MB Model: C3758D4I-4L (https://www.asrockrack.com/general/p...Specifications)
    • MB BIOS Firmware Version: P1.60 (latest as of writing this post)
    • MB CPU: Intel(R) Atom(TM) CPU C3758 @ 2.20GHz (8-core/thread)
    • Summary of the issue you are experiencing:
    MemTest86 always freezes at 16% - Test 2 (own address), every time.

    Based on other posts, I switched to testing using single CPU core only, and this seems to have fixed it.

    Please add this MB to the multi-CPU blacklist.


  • keith
    replied
    Thanks for the logs.

    The logs confirm it is likely a UEFI BIOS issue (and not related to ECC polling or an introduced bug in later versions of MemTest86).
    Even though the errors don't show up on screen in v8.4, they are present in the log file. The [UEFI Firmware Error] messages on the screen were introduced in a later version of MemTest86.

    Although this is a software bug that should be fixed in the BIOS by the vendor, it likely isn't a critical (hardware) error as MemTest86 makes it out to be. We'll need to revisit how to report such errors in a way that better represents their severity.


  • mt04340434
    replied
    Originally posted by keith View Post

    Thanks for the logs.

    Can you try first disabling ECC polling from the main menu and run the tests again.

    Also, we had a report about UEFI firmware issues on a similar chipset but different motherboard on v10.6. In this case, it seems the errors don't appear when running an earlier version of MemTest86 (v8.4).

    Can you run MemTest86 v8.4 as well and upload a copy of the logs:
    https://www.memtest86.com/downloads/...86-8.4-usb.zip
    The compressed log for the MemTest86 v8.4 test with default settings is now also uploaded for your review.

  • mt04340434
    replied
    Originally posted by keith View Post

    Thanks for the logs.

    Can you try first disabling ECC polling from the main menu and run the tests again.

    Also, we had a report about UEFI firmware issues on a similar chipset but different motherboard on v10.6. In this case, it seems the errors don't appear when running an earlier version of MemTest86 (v8.4).

    Can you run MemTest86 v8.4 as well and upload a copy of the logs:
    https://www.memtest86.com/downloads/...86-8.4-usb.zip
    Here you go!

    I ran the tests with ECC polling disabled on both v10.6 and v8.4 of MemTest86 (1 pass only), and the v10.6 run was manually terminated.

    I also ran MemTest86 v8.4 for 4 passes with default settings, but the log is too large (2.5 MB) and I cannot upload it to the forum.

    With ECC polling disabled on MemTest86 v10.6, the error message now appears in the 1st pass, but my PC is not crashing, freezing, or rebooting at all.

    The two runs with v8.4 showed no error messages at all.

    I hope these help.

  • keith
    replied
    Originally posted by mt04340434 View Post
    Hello,
    Motherboard: Super Micro H11DSi-NT
    CPU: Dual AMD EPYC 7F52
    BIOS Firmware Version: 2.7
    RAM: 16 x Hynix HMA82GR7CJR8N-XN DDR4 16 GB 3200 MHz RDIMM
    Memtest86 Version: V10.6 Free
    SMT is disabled in the BIOS
    A retired 2U Super Micro Super Chassis with a 700 W hot-swappable power supply
    Thanks for the logs.

    Can you first try disabling ECC polling from the main menu and running the tests again?

    Also, we had a report about UEFI firmware issues on a similar chipset but different motherboard on v10.6. In this case, it seems the errors don't appear when running an earlier version of MemTest86 (v8.4).

    Can you run MemTest86 v8.4 as well and upload a copy of the logs:
    https://www.memtest86.com/downloads/...86-8.4-usb.zip


  • mt04340434
    replied
    Hello,

    Sorry, admin, for the previous post. I thought I could edit it after it was posted.

    My motherboard BIOS firmware is now updated to the newest release, version 2.7, but the problem is still the same.

    My test environment can be summed up as follows:

    Motherboard: Super Micro H11DSi-NT
    CPU: Dual AMD EPYC 7F52
    BIOS Firmware Version: 2.7
    RAM: 16 x Hynix HMA82GR7CJR8N-XN DDR4 16 GB 3200 MHz RDIMM
    Memtest86 Version: V10.6 Free
    SMT is disabled in the BIOS
    A retired 2U Super Micro Super Chassis with a 700 W hot-swappable power supply

    I manually terminated the test after it finished pass #2, and the log file is attached.

    As you can see in the attached log file, my current build runs the 1st pass without any error or warning messages (including the annoying "[UEFI blah blah blah" one). However, the "[UEFI firmware Error] Could not start CPU X" message shows up after the 1st pass has finished. Before the test was aborted, the system showed no memory errors, but I got loads of "[UEFI firmware Error] Could not start CPU X" messages during the test.

    In fact, when I ran the test with BIOS version 2.5, the system completed a 4-pass test with 0 memory errors, and again I got many "[UEFI firmware Error] Could not start CPU X" messages at the end of the test, as you can see in my previous post.
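
    To see which CPUs the message refers to and how often, the log can be tallied with a short script. This is only a sketch in Python: the message text is taken from the on-screen wording quoted above, the sample lines are made up for illustration, and the real log file may lay its lines out differently.

```python
import re
from collections import Counter

# Matches the message wording quoted in this thread; case-insensitive because
# the capitalisation varies between posts. The exact log format is an assumption.
CPU_START_ERROR = re.compile(
    r"\[UEFI firmware error\]\s*Could not start CPU\s*(\d+)",
    re.IGNORECASE,
)


def count_cpu_start_failures(log_text: str) -> Counter:
    """Return how many times each CPU number appears in a 'Could not start' message."""
    return Counter(int(m.group(1)) for m in CPU_START_ERROR.finditer(log_text))


# Hypothetical sample lines, not copied from a real MemTest86 log.
sample_log = "\n".join([
    "[UEFI Firmware Error] Could not start CPU 3",
    "[UEFI firmware Error] Could not start CPU 3",
    "[UEFI Firmware Error] Could not start CPU 17",
])

failures = count_cpu_start_failures(sample_log)
for cpu, count in sorted(failures.items()):
    print(f"CPU {cpu}: {count} failed start(s)")
```

    If the same few CPU numbers keep failing, that may be worth mentioning along with the log.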

    The IPMI Health Event Log shows no failure events. (I cleared it before the posted test was conducted, because I got loads of fan failure messages when I pulled out the high-speed 8 cm fans at night.)

    I also ran the AIDA64 test (CPUs, cache & memory) for 3 hours to test the stability of my build. After the test was stopped, there were no reported errors from Windows or the IPMI Health Event Log.

    Am I likely experiencing the same BIOS bug that the other users mentioned in this thread?

    Sorry for my broken English, please let me know if there is any other info that I may be able to provide.

  • mt04340434
    replied
    Motherboard: Super Micro H11DSi-NT
    CPU: Dual AMD EPYC 7F52
    BIOS Firmware Version: 2.5


    No freezing or rebooting; the software just keeps showing the "[UEFI firmware Error] Could not start CPU X" message.

    Super Micro recently released a new BIOS version, 2.7, so the issue may be fixed?

  • David (PassMark)
    replied
    ECC RAM is generally a bit slower than non-ECC RAM. Maybe 10% to 20%.
    But running on 1 CPU Core can be a lot slower. Maybe 4x slower than multi-core.
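
    To put those rough figures side by side, here is a small sketch in Python. The 10-hour baseline is made up for illustration, and the factors are only the approximate "maybe" values quoted above, not measurements.

```python
def scaled_hours(baseline_hours: float, slowdown_factor: float) -> float:
    """Runtime after applying a multiplicative slowdown factor."""
    return baseline_hours * slowdown_factor


# Rough factors from the post above (both are approximate).
ECC_SLOWDOWN = (1.10, 1.20)   # ECC RAM: about 10% to 20% slower
SINGLE_CORE_SLOWDOWN = 4.0    # one core: about 4x slower than multi-core

baseline = 10.0  # hypothetical multi-core, non-ECC runtime in hours

ecc_low, ecc_high = (scaled_hours(baseline, f) for f in ECC_SLOWDOWN)
single_core = scaled_hours(baseline, SINGLE_CORE_SLOWDOWN)

print(f"ECC only:         {ecc_low:.1f} to {ecc_high:.1f} hours")
print(f"Single core only: {single_core:.1f} hours")
```

    With these assumed numbers, the single-core restriction accounts for far more of a long runtime than ECC does.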

    Supermicro don't care much about their BIOS bugs. More interested in selling you a new motherboard than doing customer support.


  • bartgrefte
    replied
    Recently I ran MemTest86 V10.2 Pro on a system with a Xeon E5-2670, an X9SRL-F (listed in the first post) and 80 GB of RAM. After running for about 50 hours, it was only halfway through pass 3. Initially I thought something was wrong; I've never had MemTest86 take this long to get through the passes, no matter the amount of RAM.

    Since the error count was 0, despite
    Note: Your RAM may be vulnerable to high frequency row hammer bit flips. However the conditions needed to induce these errors occur only very rarely in normal PC usage, and so this should not be of concern to most users.
    I started thinking maybe it was because I've been testing ECC RAM (instead of non-ECC), but I could not find any relation between ECC and longer runtimes.

    After checking the config, I found that the testing was done on 1 CPU core, which MemTest86 set because this motherboard's BIOS apparently has this bug. After switching from single to parallel, MemTest86 ran for barely a minute or two before the system reset (which I can reproduce). I guess that's the bug?

    Since MemTest86 was running on only 1 core, could that explain the long runtime?


  • theregoesplanb
    replied
    Here's a new link to the log that won't disappear in a few days: https://www.dropbox.com/scl/fi/egzmn...paecegy4f9igat

    One concerning thing is that I did get ECC errors in MP mode, but I haven't been able to replicate them when running in uniprocessor mode. Is there any chance that those were false positives due to MP incompatibility?

    You'll probably want to add H12DSi-N6 to the blacklist as well. It's basically the same board with a different onboard NIC.


  • keith
    replied
    Originally posted by theregoesplanb View Post
    Uploaded the log from the (manually aborted) run here: https://file.io/CMDuiCIMcCbv
    Thanks for letting us know. The link is no longer available but we've updated the blacklist.cfg file with the following:

    Code:
    "H12DSi-NT6",ALL,EXACT,RESTRICT_MP

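    For anyone keeping local blacklist entries, the line above can be read as four comma-separated fields. The sketch below is in Python and is not MemTest86 code: the field names (`board`, `bios`, `match`, `flag`) and the matching rule are assumptions made for illustration, so check the MemTest86 documentation for the actual semantics.

```python
import csv
from dataclasses import dataclass


@dataclass
class BlacklistEntry:
    board: str  # assumed: baseboard product name to match
    bios: str   # assumed: BIOS version, or "ALL" for any version
    match: str  # assumed: how the board name is compared, e.g. "EXACT"
    flag: str   # assumed: restriction to apply, e.g. "RESTRICT_MP"


def parse_entry(line: str) -> BlacklistEntry:
    """Split one blacklist line into its four fields (the quoted name may hold commas)."""
    row = next(csv.reader([line.strip()]))
    if len(row) != 4:
        raise ValueError(f"expected 4 fields, got {len(row)}: {line!r}")
    return BlacklistEntry(*(field.strip() for field in row))


def matches(entry: BlacklistEntry, board_name: str) -> bool:
    """Assumed rule: EXACT needs an identical name, anything else is a substring test."""
    if entry.match == "EXACT":
        return board_name == entry.board
    return entry.board in board_name


entry = parse_entry('"H12DSi-NT6",ALL,EXACT,RESTRICT_MP')
print(entry)
print(matches(entry, "H12DSi-NT6"))
print(matches(entry, "H12DSi-N6"))
```

    Under that assumed exact-match rule, an entry for H12DSi-NT6 would not cover a similarly named board such as H12DSi-N6, which would need its own line.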

  • theregoesplanb
    replied
    Uploaded the log from the (manually aborted) run here: https://file.io/CMDuiCIMcCbv


  • theregoesplanb
    replied
    Motherboard: Super Micro H12DSi-NT6
    CPU: Dual AMD EPYC 7642 48 core
    BIOS Firmware Version: 2.6
    BIOS Build Time: 4/13/2023

    Having issues with a Super Micro board that's not in the blacklist. See attached screenshot.

    [Attached screenshot: memtest.png]


  • keith
    replied
    Originally posted by csehydrogen View Post
    Hello,
    memtest8 kept getting stuck at "Testing multiprocessor support". My system consists of:

    WS-C621E-SAGE
    2 x Intel Xeon Gold 6130

    I updated BIOS to version 6801 (most recent one at the time of writing) and the problem didn't go away.
    After I add "WS-C621E-SAGE Series" to the blacklist to disable the multithreading, the test works fine.
    Thanks for the logs.

    Can you give this build a try:
    https://www.passmark.com/temp/memtes...-10.1.0010.zip


  • csehydrogen
    replied
    Hello,
    MemTest86 kept getting stuck at "Testing multiprocessor support". My system consists of:

    WS-C621E-SAGE
    2 x Intel Xeon Gold 6130

    I updated the BIOS to version 6801 (the most recent one at the time of writing) and the problem didn't go away.
    After I added "WS-C621E-SAGE Series" to the blacklist to disable multithreading, the test worked fine.