MemTest86 v6.0 Beta (2015-02-13 Update - Beta testing is now closed)

  • MemTest86 v6.0 Beta (2015-02-13 Update - Beta testing is now closed)

    (Feb 13/2015 Update) MemTest86 v6.0.0 has been officially released. As a result, the MemTest86 v6.0 beta testing period is now over. Thank you to our beta testers for the bug reports and feedback.

    Downloads:
    (Feb 13/2015 Update) MemTest86 v6.0.0 Pro Edition can now be purchased from our sales page. As usual, the Free Edition is available for download on the normal MemTest86 download page. Beta downloads are no longer available.

    New Features

    • Support for DDR4 RAM (and associated hardware), including retrieval and reporting of DDR4-specific SPD details
    • Support for Haswell-E (DDR4) ECC detection (untested)
    • New RAM benchmarking feature allowing results to be graphed and saved to disk. Previous results can be graphed on the same chart for comparison.
    • New "Hammer Test" for detecting disturbance errors, which, in simple terms, are caused by repeatedly accessing addresses in the same memory bank but in different rows within a short period of time. For more details, see this paper by Yoongu Kim
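
    To make the "same bank, different rows" idea concrete, here is a minimal sketch of how a pair of hammer addresses could be chosen. The address-to-bank/row mapping, the constants, and all function names are assumptions made up for illustration; real memory controllers use their own mappings, and this is not MemTest86's implementation. Python also cannot bypass the CPU cache, so this only shows the address selection, not an actual hammer loop.

```python
# Toy DRAM geometry, assumed purely for illustration. Real memory
# controllers use vendor-specific (often undocumented) address mappings.
COL_BITS = 13   # 8 KiB per row
BANK_BITS = 3   # 8 banks
ROW_BITS = 15   # 32768 rows per bank


def decode_address(addr):
    """Split an address into (bank, row, column) under the toy mapping."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return bank, row, col


def encode_address(bank, row, col=0):
    """Inverse of decode_address."""
    return (row << (COL_BITS + BANK_BITS)) | (bank << COL_BITS) | col


def aggressor_pair(addr, row_distance=2):
    """Return two addresses in the same bank, row_distance rows apart
    (wrapping around at the last row)."""
    bank, row, col = decode_address(addr)
    other_row = (row + row_distance) % (1 << ROW_BITS)
    return encode_address(bank, row, col), encode_address(bank, other_row, col)


def access_sequence(addr, iterations):
    """Yield the alternating access order a hammer loop would issue."""
    first, second = aggressor_pair(addr)
    for _ in range(iterations):
        yield first
        yield second
```

    Alternating between the two addresses means consecutive accesses always target different rows of one bank, which is the access pattern the description above refers to.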

    (Beta 2 changes)
    • Added preliminary language support (Only 'Japanese' is partially available to test Unicode character support. Translation work is ongoing). The language can be specified from the 'Settings' window in the Main Menu
    • Intel XMP 2.0 DDR4 RAM timings are now supported when displaying RAM SPD info

    (Beta 3 changes)
    • Added translations and language options for French/German/Japanese/Chinese


    Fixes/Enhancements

    • Fixed ECC detection for Ivy Bridge-EX/Haswell-EX chipsets that have a 2nd memory controller
    • Fixed ECC errors immediately being reported after starting test (Ivy Bridge-E)
    • Updated ImageUSB to v1.1.1015 which includes an option to zero the USB drive
    • Running memory tests in parallel mode is now more robust for UEFI BIOS that exhibit inconsistent multiprocessor behaviour
    • Fixed detection of the number of enabled processors for UEFI BIOS that exhibit inconsistent multiprocessor behaviour
    • Fixed test status screen not being displayed correctly for consoles with small/large screen widths
    • Increased maximum # of supported CPUs to 72
    • Increased maximum # of supported RAM modules to 64
    • In the RAM SPD menu screen, PGUP/PGDN can be used to navigate between pages of RAM modules
    • For specific cases where files under EFI\BOOT cannot be accessed (eg. grub2), log/report files shall be written to the root directory
    • During MemTest86 boot-up, the system memory map is now written to log file
    • CPU speed measurement is now more robust by taking multiple samples
    • Various optimizations of the size of the MemTest86 binary
    • Forced a memory address limit of 32-bits when running under 32-bit UEFI
    • Memory ranges to be tested are now allocated at the beginning of each test (due to the possibility that the memory map changes in the middle of testing)
    • Reduced the number of log messages written when waiting for other processors to finish when running in parallel mode
    • When allocating memory for Bit Fade Test, leave 1MB of free memory available (to prevent firmware drivers from running out of memory)

    (Beta 2 changes)
    • Added Xeon E5 v3 ECC support
    • Added Ivy Bridge (non-Xeon) ECC support
    • Added AMD Steppe Eagle ECC support
    • Fixed Intel5400 ECC registers not being reset after starting test
    • Added support for ECC injection for Intel Xeon E3 v3 (untested)
    • Fixed certain Xeon chipsets probing non-existent IMC1 SMBUS
    • Fixed handling of Intel ICH SMBUS built-in hardware semaphore to prevent SMBus device contention
    • Fixed Intel turbo clock speed calculation
    • Fixed possible crash when DDR3 module type value in the RAM SPD info is invalid
    • Fixed DDR4 SPD clock speed rounding errors in the RAM SPD info
    • Fixed DDR3 SPD Register manufacturer/type in the RAM SPD info not appearing correctly
    • New config file parameter 'ECCINJECT' for specifying whether to enable/disable ECC injection
    • New config file parameter 'MEMCACHE' for specifying whether to enable/disable memory caching
    • New config file parameter 'PASS1FULL' for specifying whether the first pass should run the full iteration or reduced iteration
    • New config file parameter 'ADDR2CHBITS' to specify the address bits to XOR to determine the memory channel
    • New config file parameter 'LANG' for specifying language to use on startup
    • Fixed potential crash or other unexpected behaviour due to memory issues with random functions
    • Reports are now saved using UTF16 encoding to support Unicode characters
    • Increased the number of supported memory controllers to 8
    • Changed memory allocation behaviour by only pre-allocating memory segments >= 16MB to prevent memory starvation
    • For Test 13 Hammer Test, only run in parallel mode if the memory segment per CPU is >= 32MB (minimum required to support bits being hammered)
    • Fixed "Hammer Test" text not appearing in test report
    • When mapping memory layout, removed several limits reducing the memory space tested
    • Fixed memory being allocated after memory layout has been mapped (thus changing the memory layout)
    • Fixed memory leak when cleaning up after test completion
    • Fixed memory leak when decoding PNG files
    • Fixed progress bar not displaying 0% on completion of a pass
    • Console resolution is now forced to 80 x 25
    • Graphics resolution is now set to a minimum of 800 x 600
    • Updated to new UEFI SDK libraries (UDK2014)
    • Fixed memtest86v4 incorrectly booting to serial mode by default
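
    As a rough illustration of what the 'ADDR2CHBITS' item above describes (XORing selected address bits to determine the memory channel), here is a minimal sketch. It assumes a two-channel system where the XOR result is the channel index; the function name and the example bit positions are hypothetical, and the config file syntax itself is not shown here.

```python
def memory_channel(addr, bit_positions):
    """XOR the selected address bits together.

    Assumption: two channels, so the single-bit result (0 or 1)
    is taken as the channel index.
    """
    channel = 0
    for bit in bit_positions:
        channel ^= (addr >> bit) & 1
    return channel


# Hypothetical choice of address bits, for illustration only.
EXAMPLE_BITS = [12, 9, 7]

for addr in (0x1000, 0x1200, 0x1280):
    print(hex(addr), "-> channel", memory_channel(addr, EXAMPLE_BITS))
```

    Which bits actually select the channel depends on the memory controller, which is presumably why the value is left configurable.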

    (Beta 3 changes)
    • Fixed freeze during initialization of ECC support for Intel E5 v3 (Haswell) due to reading from non-existent MSR register
    • Fixed freeze on systems that use older UEFI firmware (such as Mac) that do not support string packages and fonts using the Hii Database. These systems may have limited language support, however.
    • Changed minimum resolution to 1024 x 768
    • Fixed Main Menu text being overlapped on some systems
    • Various system info related fixes (SPD, CPU)


    How to report problems

    Either make a post here in the forum, or send us an email at the address listed on our contact page. When reporting an error, please provide as much detail as possible. If you are running from a USB drive, there should be a log file called MemTest86.log in the 'EFI/BOOT' directory. Sending us this file will be of great help. A photograph of the problem would also be useful, if possible/applicable.

    Screenshots

    Screenshot: DDR4 SPD (MemTest86v6.0-DDR4-SPD.jpg)

    Screenshot: Benchmark results graphing (MemTest86v6.0-benchmark-graph.jpg)

    Screenshot: Hammer Test (MemTest86v6.0-Hammer-Test.jpg)
    Last edited by keith; Feb-13-2015, 11:23 PM.

  • #2
    Amazingly enough, I got 0 errors in the row hammer test. From the presentation you cited, that seems like it would put me in the enviable 15% of RAM that's not affected.

    The actual paper also indicates that ECC (at least "simple" ECC) can't correct all errors. Specifically, it can't correct multi-bit errors, and it can't even necessarily detect errors when 3 or more bits in a 64-bit word are flipped.

    Is there any way to tell whether I passed row hammer because I only got single-bit errors and ECC fixed them, or whether I actually got no errors? (I'm guessing that it's unlikely that I got all 3-bit errors that were missed by ECC.)

    Do you know if the access pattern used in their sample implementation (being suboptimal but fast) is unlikely to trigger multi-bit errors? The sample image you provided shows all single-bit errors, which leads me to suspect ECC might have silently fixed errors on my system.
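
    To illustrate the point about "simple" ECC, here is a minimal sketch of a single-error-correcting, double-error-detecting (SECDED) code. It uses a toy extended Hamming (8,4) code rather than the wider code an ECC DIMM would use, and the function names are made up for this example; the behaviour with 1, 2 and 3 flipped bits follows the same pattern, though.

```python
DATA_POSITIONS = (3, 5, 6, 7)  # Hamming positions that carry data bits


def secded_encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the usual
    Hamming positions (parity bits at 1, 2 and 4).
    """
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def secded_decode(code):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        # Looks like a single-bit error: flip the bit the syndrome points at
        # (syndrome 0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with intact overall parity: two bits flipped.
        return None, "uncorrectable"
    data = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return data, status


def flip(code, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


word = secded_encode(0b1011)
print(secded_decode(flip(word, [5])))        # one flipped bit
print(secded_decode(flip(word, [2, 5])))     # two flipped bits
print(secded_decode(flip(word, [1, 2, 5])))  # three flipped bits
```

    With three flipped bits the decoder sees what looks like a single-bit error, "corrects" the wrong bit, and hands back wrong data without flagging anything, which is the failure mode mentioned above.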


    • #3
      We think the estimate in the paper was an overestimate (maybe to get publicity?).
      We think that maybe only 5% - 20% of RAM is affected. But we haven't run it on enough different machines as yet to have an accurate percentage.

      Clearly it isn't the case that 85% of the RAM in the world is bad.

      In the cases where we did see an error, they were often single bit errors. So this also contradicts the rather alarmist paper.


      • #4
        Originally posted by David (PassMark) View Post
        We think the estimate in the paper was an overestimate...
        We think that maybe only 5% - 20% of RAM is affected. But we haven't run it on enough different machines as yet to have an accurate percentage.
        Their README mentions that off-the-shelf memory controllers mess with both the bit values and address-to-row mapping for performance reasons, and that they had to use a custom-built memory controller in order to hammer the rows optimally.

        I wonder if that accounts for the difference?

        In the cases where we did see an error, they were often single bit errors.
        And those would get corrected by ECC.

        So is there any way for me to disable ECC so that I can see whether the RAM is susceptible absent correction?


        • #5
          Originally posted by pjc View Post
          Their README mentions that off-the-shelf memory controllers mess with both the bit values and address-to-row mapping for performance reasons, and that they had to use a custom-built memory controller in order to hammer the rows optimally.

          I wonder if that accounts for the difference?
          There are limitations with a purely software approach (ie. without specialized hardware) in that you aren't able to specify the exact memory rank/bank/row to access, or even the exact data to be written in the memory cell. So this makes it difficult to fully implement an extensive hammer test.

          On the other hand, this also means that in the real world, systems without the specialized hardware (ie. virtually all systems) would have just as much difficulty exercising the memory in that "optimal" way. So a purely software approach is the only way to go.

          Originally posted by pjc View Post
          So is there any way for me to disable ECC so that I can see whether the RAM is susceptible absent correction?
          Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors.


          • #6
            Originally posted by keith View Post
            Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors.
            Excellent -- I didn't realize it would already warn me about that. Is such detection fairly conspicuous? All I saw was 0 errors, and no other warnings.

            (I checked, and my BIOS doesn't seem to let me disable ECC.)


            • #7
              If ECC polling is enabled in MemTest86, ECC errors will be displayed on screen and logged in the log file.


              • #8
                How to Determine if ECC Polling is Enabled?

                Based on the previous comment, I have observed ECC memory showing up as ECC on the main test information page. I had assumed that if the test detected ECC memory, then ECC polling was/is the default. If this is not the case, how does one determine if ECC polling is enabled?


                • #9
                  It'll say it in the log (look for "ECC polling enabled") at least.
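
                  If you want to check this from another machine, here is a minimal sketch that searches a copy of the log for that string. The function name is made up for this example, and the handling of the file encoding is an assumption (it accepts either UTF-16 with a byte-order mark or plain ASCII/UTF-8), since the log's encoding is not stated in this thread.

```python
from pathlib import Path


def ecc_polling_enabled(log_path):
    """Return True if the log mentions that ECC polling is enabled."""
    raw = Path(log_path).read_bytes()
    # Assumption: the log is either UTF-16 with a BOM or ASCII/UTF-8.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        text = raw.decode("utf-16")
    else:
        text = raw.decode("utf-8", errors="replace")
    # Match case-insensitively in case the exact capitalisation differs.
    return "ecc polling enabled" in text.lower()
```

                  You would call it with the path to MemTest86.log from the 'EFI/BOOT' directory of the USB drive, wherever that drive happens to be mounted on your system.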


                  • #10
                    Originally posted by keith View Post
                    Usually, the BIOS setup has an option to disable ECC. However, even with ECC enabled, MemTest86 should be able to report any detected correctable ECC errors.
                    What I always wanted to ask with regard to this:
                    How does this interact with the built-in detection mechanism?
                    Usually firmware is supposed to report (or at least redirect) issues that occurred.

                    Will Memtest hook this in a way that only Memtest will report the errors or should they still show up on both ends?
                    Do you recommend disabling ECC before performing tests with such memory?


                    • #11
                      Originally posted by orioon View Post
                      What I always wanted to ask with regard to this:
                      How does this interact with the built-in detection mechanism?
                      Usually firmware is supposed to report (or at least redirect) issues that occurred.

                      Will Memtest hook this in a way that only Memtest will report the errors or should they still show up on both ends?
                      This depends on the system setup. For example, the firmware may configure the hardware to trigger an interrupt when an ECC error is detected, which is then processed by SMI handlers.

                      For some chipsets (usually newer), there may be multiple mechanisms for reporting ECC errors, so MemTest86 is able to report these errors independently of the firmware's own error handling. For others, the mechanism may be shared, though we have not yet encountered a chipset where there is contention. It may just be that these systems are unlikely to have UEFI support.

                      Originally posted by orioon View Post
                      Do you recommend disabling ECC before performing tests with such memory?
                      Generally, it is a good idea to leave ECC enabled as it allows for testing of the ECC detection and handling procedure.


                      • #12
                        Hi, I am not able to boot v6.0. I tried two different USB devices; both boot fine with v5.1, but once flashed to v6.0, UEFI does not load on boot, only legacy.
                        The same happens when manually starting UEFI boot from USB in the BIOS: a blank screen for one second, then a return to the BIOS.

                        Also no logs in \EFI\BOOT

                        Motherboard: Asus F2A85-V
                        CPU: AMD A8-5500


                        • #13
                          Because MemTest86 v6 beta is unsigned, it may be that you need to disable SecureBoot in the BIOS setup (could be under a setting called 'OS Type' - you need to set to 'Other OS').

                          MemTest86 v5.1.0 is signed so it passes the signature check of SecureBoot.

                          We'll attempt to get the final V6 release signed by Microsoft (who control the UEFI keys) before its release.


                          • #14
                            Hi,

                            first some minor thing: "E(x)it" is missing in config menu (but "x" is still working)

                            second problem: Test 10 Bit fade gives errors in 6.0b1 but not in 5.1, Board is Intel MFS2600KI with two Xeon E5-2620 and 8x 8GB Kingston 9965433-095.A00LF

                            third problem: No Multi-CPU test, but this is probably because of EFI Standard 2.0, this board doesn't have any newer updates

                            Memtest.log: http://pastebin.com/9MpEB89Z

                            Attached image: 2LDqaBH.png


                            • #15
                              Thanks for the logs.

                              Are the Test 10 errors 100% reproducible? It looks as though some other program (eg. driver) is writing into the memory reserved for testing while the test is idling. This should not happen as the error addresses are within the memory segment reserved exclusively for MemTest86.
