Multiprocessing Errors

  • Multiprocessing Errors

    I'm currently testing a batch of Supermicro H11DSU-iN boards configured with EPYC 7502 and 512 GB of 3200 RAM. I haven't found any memory issues yet, but only because I can't get memtest to function correctly in parallel mode. Left at the defaults, memtest intermittently ends up 'Setting default CPU mode to SINGLE', which is far from optimal for my use case. When it doesn't do this, it ends up 'Setting default CPU mode to PARALLEL' and then produces hundreds of MP CPU errors, for example 'RunMemoryRangeTest - CPU #52 completed but did not signal...WARNING - possible multiprocessing bug in BIOS'. Forcing parallel mode via the config gives the same result. I have attached the relevant log files in which I found these issues.
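
    For anyone wanting to check whether these errors cluster on particular cores, here is a minimal Python sketch that tallies the 'completed but did not signal' messages per CPU number. The pattern is based only on the message quoted above; the function name `count_mp_errors`, the file name in the usage note, and the log encoding are assumptions, not something taken from the memtest documentation.

```python
import re
from collections import Counter

# Pattern built from the message quoted in the post. The rest of the log
# line layout is an assumption, so we search rather than match whole lines.
MP_ERROR = re.compile(r"CPU #(\d+) completed but did not signal")


def count_mp_errors(lines):
    """Return a Counter mapping CPU number -> count of 'did not signal' messages."""
    counts = Counter()
    for line in lines:
        found = MP_ERROR.search(line)
        if found:
            counts[int(found.group(1))] += 1
    return counts


def count_mp_errors_in_file(path, encoding="utf-8"):
    """Tally errors in one log file. The encoding is a guess; adjust it if
    the log turns out to be written as UTF-16."""
    with open(path, encoding=encoding, errors="replace") as handle:
        return count_mp_errors(handle)
```

    Calling `count_mp_errors_in_file("memtest.log")` (hypothetical file name) and sorting the result by CPU number should show whether one core, or a whole group of cores, is responsible for most of the messages.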

    This is with memtest86 10.2 booted from USB; normally, though, we boot via PXE. When we do that, we get to test 4 and memtest either hangs or repeats 'error reporting to pxe server', and 'The process cannot access the file because it is being used by another process' shows up within Serva as well. I've read about certain SFPs causing issues, but I'm booting off an i350 with no other NICs installed. I also wonder whether this problem is linked to the MP issue: maybe it's reporting back so frequently that it causes problems.

    Releasing the memory after testing is also unusually quick compared to what I'd expect: it takes 2-3 seconds, where it usually takes 10-45 seconds.

    I appreciate that I am not running 10.5, which would include the fix 'Fixed freeze on some UEFI firmware when attempting to enable main thread during Multi-Processor init' from the 10.3 update. That may be related to my problems, but I'm unsure. I believe we have access to 10.5, but the password holder for the account is currently on holiday and unreachable during an important job such as this... typical.

    Anyhow, thanks in advance for any help.

  • #2
    There are a bunch of Supermicro boards with UEFI multi-processing bugs.

    You can try reporting the issue to Supermicro. Otherwise you might need to run single-threaded (which of course isn't optimal for memory testing). As you mentioned, upgrading to the latest release would also be a good idea; then see whether that fixes the PXE issue.

    You can also use BurnInTest for RAM testing inside Windows.
