I'm currently testing a batch of Supermicro H11DSU-iN boards configured with EPYC 7502 and 512 GB of 3200 RAM. I can't report any memory issues yet, because I can't get MemTest86 to function correctly in parallel mode. If left at the defaults, MemTest86 intermittently ends up 'Setting default CPU mode to SINGLE', which is far from optimal for my use case. When it doesn't do this, it ends up 'Setting default CPU mode to PARALLEL' and then produces hundreds of MP CPU errors, one of which is 'RunMemoryRangeTest - CPU #52 completed but did not signal...WARNING - possible multiprocessing bug in BIOS'. Forcing parallel mode via the config does the same thing. I have attached the relevant log files in which I found these issues.
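In case it helps anyone go through the logs, here is a rough sketch of how the 'did not signal' errors could be tallied per CPU. It assumes every such log line contains the same wording as the message quoted above; the function name `count_stalled_cpus` and the sample lines are made up for illustration and are not part of MemTest86.

```python
import re
from collections import Counter

# Pattern based only on the single log message quoted in this post.
# Assumption: other occurrences use the same wording.
STALL_RE = re.compile(r"CPU #(\d+) completed but did not signal")


def count_stalled_cpus(lines):
    """Return a Counter mapping CPU number to its count of 'did not signal' lines."""
    counts = Counter()
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; in practice, pass the lines of a real log file.
    sample = [
        "RunMemoryRangeTest - CPU #52 completed but did not signal",
        "RunMemoryRangeTest - CPU #52 completed but did not signal",
        "RunMemoryRangeTest - CPU #7 completed but did not signal",
        "Setting default CPU mode to PARALLEL",
    ]
    for cpu, n in sorted(count_stalled_cpus(sample).items()):
        print(f"CPU #{cpu}: {n} error(s)")
```

If the errors cluster on a few CPU numbers rather than being spread across all of them, that might be worth mentioning alongside the logs.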
This is with MemTest86 10.2 booted from USB; almost every other time, we boot via PXE. When we boot via PXE, though, we get to test 4 and MemTest86 either hangs or repeats 'error reporting to pxe server', and 'The process cannot access the file because it is being used by another process' shows up within Serva as well. I've read about certain SFPs causing issues, but I'm booting off an i350 with no other NICs installed. I'm also wondering whether this problem is linked to the MP issue, as maybe it's reporting back so frequently that it's causing issues.
Releasing the memory after testing is also unusually slow compared with what I'd expect; it usually takes between 10 and 45 seconds rather than 2-3.
I appreciate that I am not running 10.5, which would include the fix 'Fixed freeze on some UEFI firmware when attempting to enable main thread during Multi-Processor init' from the 10.3 update. That may be related to my problems, but I'm unsure. I believe we would have access to 10.5, but the password holder for the account is currently on holiday and unreachable during an important job such as this... typical.
Anyhow, thanks for any help in advance.