MemTest86 v6.0 Beta (2015-02-13 Update - Beta testing is now closed)

  • nobody101
    replied
    I'm running 6.0 Beta on an FBDIMM-based Mac Pro and always get a correctable ECC error during test 0. I've tried about 10 different kinds of memory (Hynix, Samsung, Micron, different densities, etc.) and they all show this same initial ECC error. All of the modules complete the remaining tests (I didn't try hammer) without issue, so this may be a false failure. See the attached image (IMG_3635.jpg).


  • MoKeiichi
    replied
    All the whitespace is in the original file. I've also run the other tests, except for hammering, which I aborted after 20h. As this machine is now prepared for production use, I cannot run any more tests on it, but I will probably have another one in a few weeks.


  • keith
    replied
    Thanks for the additional info.

    The line spacing in the logs of the 2 machines with the errors looks odd. Does that appear in the actual log file, or is it because of pastebin?

    The only major change between 5.1 and 6.0 is that while 5.1 reserves all available memory in the system, 6.0 leaves about 1MB free to prevent drivers from being starved of memory. However, the error addresses (0x3FFFFC98 to 0x3FFFFE94) are not within the unallocated 1MB.

    Can you try setting the upper address limit to 0x100000000 and see if you still get the errors?

    Also, have you tried running the other tests as well?
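
    For readers following the address arithmetic, here is a minimal Python sketch that places the reported error addresses and the suggested upper limit on a MiB scale. It only restates the numbers quoted in this thread; `describe` is a helper name made up for the example, and where the unallocated 1MB actually sits is not stated here, so the sketch does not try to locate it.

```python
MIB = 1024 * 1024  # bytes per MiB

def describe(addr):
    """Return (MiB index, byte offset inside that MiB) for an address."""
    return addr // MIB, addr % MIB

# Values quoted in the thread
low, high = 0x3FFFFC98, 0x3FFFFE94   # reported error address range
limit = 0x100000000                  # suggested upper address limit

span = high - low                    # width of the error window in bytes
below_1gib = 0x40000000 - high       # distance from the top address to 1 GiB

print("low :", describe(low))
print("high:", describe(high))
print("span in bytes:", span)
print("bytes below the 1 GiB boundary:", below_1gib)
print("upper limit in MiB:", limit // MIB)
```

    Both addresses fall in MiB number 1023, which matches the "1023MB" figure quoted in the reports, and the whole window is only a few hundred bytes wide, just under the 1 GiB boundary.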


  • MoKeiichi
    replied
    I now have three identical servers. One doesn't show any errors with either version of MemTest86; the other two always show errors in Test 10 with version 6.0b1 (4 passes), but no errors with version 5.1. The errors always seem to vary a little, though.
    They are always somewhere between 0x3FFFFC98 and 0x3FFFFE94 (1023MB), and most of them are "Expected: FFFFFFFF, Actual: 00000000", but there are also some like "Expected: 00000000, Actual: 646E6F63" or "Expected: FFFFFFFF, Actual: 46464633".
    See Logs:
    Server 1 (good): http://pastebin.com/5Lpx3x3c
    Server 2 (bad): http://pastebin.com/4zeyksL8
    Server 3 (bad): http://pastebin.com/6fUC5920

    I now wonder whether those errors are caused by a) faulty RAM, b) a bug in MemTest86 6.0b1, or c) buggy UEFI.

    I lean toward c), because I never managed to install Windows 2012 R2 in UEFI mode on those machines, only in BIOS mode. This OS isn't officially supported by Intel on them, because those machines are already EOL, and as I said, they report EFI Standard 2.0, which is rather old.
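
    One quick check on the two unusual "Actual" values quoted above, offered as a sketch and not a diagnosis: if the test reads memory as little-endian 32-bit words (the usual case on x86, but an assumption here), the bytes can be printed as text to see whether they look like string data and not random bit flips. `as_ascii` is a helper name invented for this example.

```python
import struct

def as_ascii(word):
    """Render the 4 bytes of a 32-bit word, in little-endian memory order, as text."""
    raw = struct.pack("<I", word)  # lowest-order byte first, as stored on x86
    # Replace non-printable bytes with '.' so the output stays readable
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# The two non-trivial "Actual" values quoted in the post
for value in (0x646E6F63, 0x46464633):
    print(f"{value:08X} -> {as_ascii(value)!r}")
```

    Under that byte-order assumption the values decode to "cond" and "3FFF", which are printable ASCII. That would be consistent with something writing text into the region, but it does not by itself say what wrote it.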


  • keith
    replied
    Thanks for the logs.

    Are the Test 10 errors 100% reproducible? It looks as though some other program (e.g. a driver) is writing into the memory reserved for testing while the test is idling. This should not happen, as the error addresses are within the memory segment reserved exclusively for MemTest86.


  • MoKeiichi
    replied
    Hi,

    First, a minor thing: "E(x)it" is missing from the config menu (but "x" still works).

    Second problem: Test 10 (Bit fade) gives errors in 6.0b1 but not in 5.1. The board is an Intel MFS2600KI with two Xeon E5-2620 CPUs and 8x 8GB Kingston 9965433-095.A00LF.

    Third problem: no multi-CPU test, but this is probably because of EFI Standard 2.0; there are no newer updates for this board.

    Memtest.log: http://pastebin.com/9MpEB89Z

    Attached image: 2LDqaBH.png


  • keith
    replied
    Because MemTest86 v6 beta is unsigned, you may need to disable SecureBoot in the BIOS setup (it could be under a setting called 'OS Type', which you would need to set to 'Other OS').

    MemTest86 v5.1.0 is signed so it passes the signature check of SecureBoot.

    We'll attempt to get the final V6 release signed by Microsoft (who control the UEFI keys) before its release.


  • jets
    replied
    Hi, I am not able to boot v6.0. I tried two different USB devices; both boot fine with v5.1, but when flashed with v6.0, UEFI does not load on boot, only legacy.
    The same happens when manually starting UEFI boot from USB in the BIOS: a blank screen for one second, then a return to the BIOS.

    Also no logs in \EFI\BOOT

    Motherboard: Asus F2A85-V
    CPU: AMD A8-5500


  • keith
    replied
    Originally posted by orioon:
    "What I always wanted to ask in regards to this:
    How does this interact with built-in detection mechanism?
    Usually firmwares are supposed to report (or at least redirect) issues that occurred.

    Will Memtest hook this in a way that only Memtest will report the errors or should they still show up on both ends?"

    This depends on the system setup. For example, the firmware may configure the hardware to trigger an interrupt when an ECC error is detected, which is then processed by SMI handlers.

    For some chipsets (usually newer ones), there may be multiple mechanisms for reporting ECC errors, so MemTest86 is able to report these errors independently of the firmware's own error handling. For others, the mechanism may be shared, though we have not yet encountered a chipset where there is contention. It may just be that these systems are unlikely to have UEFI support.

    Originally posted by orioon:
    "Do you recommend disabling ECC before performing tests with such memory?"

    Generally, it is a good idea to leave ECC enabled, as it allows for testing of the ECC detection and handling procedure.


  • orioon
    replied
    Originally posted by keith:
    "Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors."

    What I always wanted to ask in regards to this:
    How does this interact with built-in detection mechanism?
    Usually firmwares are supposed to report (or at least redirect) issues that occurred.

    Will Memtest hook this in a way that only Memtest will report the errors or should they still show up on both ends?
    Do you recommend disabling ECC before performing tests with such memory?


  • pjc
    replied
    It'll say it in the log (look for "ECC polling enabled") at least.
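
    For anyone who wants to script that check, here is a minimal Python sketch that scans log text for the "ECC polling enabled" string mentioned above. The function names, the sample lines, and the default encoding are assumptions made for illustration; the thread does not say where the log lives or how it is encoded, so pass the path and encoding that match your setup.

```python
from pathlib import Path

MARKER = "ECC polling enabled"  # the string quoted in this thread

def ecc_polling_lines(log_text):
    """Return the lines of log_text that mention the ECC polling marker."""
    wanted = MARKER.lower()
    return [line.strip() for line in log_text.splitlines() if wanted in line.lower()]

def check_log(path, encoding="utf-8"):
    """Return True if the log file at `path` mentions the marker.

    The encoding default is an assumption, not something stated in the thread.
    """
    text = Path(path).read_text(encoding=encoding, errors="replace")
    return bool(ecc_polling_lines(text))

if __name__ == "__main__":
    # Made-up sample text, only to show the matching behaviour
    sample = "hypothetical first line\n  ECC polling enabled\nhypothetical last line\n"
    print(ecc_polling_lines(sample))
```

    An empty result only means the marker was not found in that text; it does not by itself show that ECC is unsupported on the machine.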


  • dsullaustin
    replied
    How to Determine if ECC Polling is Enabled?

    Based on the previous comment, I have observed ECC memory showing up as ECC on the main test information page. I had assumed that if the test detected ECC memory, then ECC polling was/is the default. If this is not the case, how does one determine if ECC polling is enabled?


  • keith
    replied
    If ECC polling is enabled in MemTest86, ECC errors will be displayed on screen and logged in the log file.


  • pjc
    replied
    Originally posted by keith:
    "Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors."

    Excellent -- I didn't realize it would already warn me about that. Is such detection fairly conspicuous? All I saw was 0 errors, and no other warnings.

    (I checked, and my BIOS doesn't seem to let me disable ECC.)


  • keith
    replied
    Originally posted by pjc:
    "Their README mentions that off-the-shelf memory controllers mess with both the bit values and address-to-row mapping for performance reasons, and that they had to use a custom-built memory controller in order to hammer the rows optimally.

    I wonder if that accounts for the difference?"

    There are limitations with a purely software approach (i.e. without specialized hardware) in that you aren't able to specify the exact memory rank/bank/row to access, or even the exact data to be written to the memory cell. This makes it difficult to fully implement an extensive hammer test.

    On the other hand, this also means that in the real world, systems without the specialized hardware (i.e. virtually all systems) would have just as much difficulty exercising the memory in that "optimal" way. So a purely software approach is the only way to go.

    Originally posted by pjc:
    "So is there any way for me to disable ECC so that I can see whether the RAM is susceptible absent correction?"

    Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors.
