MemTest86 v6.0 Beta (2015-02-13 Update - Beta testing is now closed)


  • #16
    I now have three identical servers. One shows no errors with either version of MemTest86; the other two always show errors in Test 10 with version 6.0b1 (4 passes), but none with version 5.1. The errors seem to vary a little each time, though.
    They are always somewhere between 0x3FFFFC98 and 0x3FFFFE94 (1023MB), and most of them are "Expected: FFFFFFFF, Actual: 00000000", but there are also some like "Expected: 00000000, Actual: 646E6F63" or "Expected: FFFFFFFF, Actual: 46464633".
    See Logs:
    Server 1 (good): http://pastebin.com/5Lpx3x3c
    Server 2 (bad): http://pastebin.com/4zeyksL8
    Server 3 (bad): http://pastebin.com/6fUC5920
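
As a quick sanity check on the addresses quoted above, the following minimal sketch converts each one to a megabyte offset and to its distance from the 1 GiB boundary. The helper name `describe` is made up for illustration, and the sketch assumes the reported values are plain physical byte addresses.

```python
MIB = 1 << 20  # bytes per MiB
GIB = 1 << 30  # bytes per GiB

def describe(address: int) -> tuple[int, int]:
    """Return (whole MiB below the address, bytes from the address up to 1 GiB)."""
    return address // MIB, GIB - address

# The two ends of the error range reported in the post.
for addr in (0x3FFFFC98, 0x3FFFFE94):
    mib, gap = describe(addr)
    print(f"{addr:#010x}: {mib} MiB, {gap} bytes below the 1 GiB boundary")
```

Both addresses fall in the 1023MB region and sit less than 1 KB below the 1 GiB mark.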

    I now wonder if those errors are a) faulty RAM, b) bug in Memtest 6.0b1 or c) buggy UEFI

    I lean towards c), because I never managed to install Windows 2012 R2 in UEFI mode on those machines, only in BIOS mode. Intel doesn't officially support this OS on them because the machines are already EOL and, as mentioned, they report EFI Standard 2.0, which is rather old.

    Comment


    • #17
      Thanks for the additional info.

      The line spacing in the logs of the 2 machines with the errors looks odd. Does that appear in the actual log file, or is it because of pastebin?

      The only major change between 5.1 and 6.0 is that while 5.1 reserves all available memory in the system, 6.0 leaves about 1MB free to prevent memory starvation from drivers. However, the error address range (0x3FFFFC98 to 0x3FFFFE94) is not within the unallocated 1MB.

      Can you try setting the upper address limit to 0x100000000 and see if you still get the errors?

      Also, have you tried running the other tests as well?

      Comment


      • #18
        All the whitespace is in the original file. I've also run the other tests, except for hammering, which I aborted after 20h. As this machine is now being prepared for production use, I cannot run any more tests on it, but I will probably have another one in a few weeks.

        Comment


        • #19
          I'm running 6.0 Beta on an FBDIMM based Mac Pro and always get a correctable ECC error during test 0. I've tried about 10 different flavors of memory (Hynix, Samsung, Micron, different densities, etc) and they all show this same initial ECC error. All of the modules will complete the remaining tests (didn't try hammer) without issue. So, this may be a false failure. See attached.

          [Attached image: IMG_3635.jpg]

          Comment


          • #20
            It may be that the ECC registers were not reset at the beginning of the test. Can you upload or send us a copy of the MemTest86.log file under EFI/BOOT/?

            Comment


            • #21
              Originally posted by keith View Post
              It may be that the ECC registers were not reset at the beginning of the test. Can you upload or send us a copy of the MemTest86.log file under EFI/BOOT/?
              Log file sent.

              Comment


              • #22
                MemTest86 6.0 Beta 2 release

                We are pleased to announce MemTest86 6.0 Beta 2 is now available for download from the MemTest86 download page.

                Changes since Beta 1 are as follows.

                New Features

                • Added preliminary language support (Only 'Japanese' is partially available to test Unicode character support. Translation work is ongoing). The language can be specified from the 'Settings' window in the Main Menu
                • Intel XMP 2.0 DDR4 RAM timings are now supported when displaying RAM SPD info


                Fixes/Enhancements

                • Added Xeon E5 v3 ECC support
                • Added Ivy Bridge (non-Xeon) ECC support
                • Added AMD Steppe Eagle ECC support
                • Fixed Intel5400 ECC registers not being reset after starting test
                • Added support for ECC injection for Intel Xeon E3 v3 (untested)
                • Fixed certain Xeon chipsets probing non-existent IMC1 SMBus
                • Fixed handling of Intel ICH SMBus built-in hardware semaphore to prevent SMBus device contention
                • Fixed Intel turbo clock speed calculation
                • Fixed possible crash when DDR3 module type value in the RAM SPD info is invalid
                • Fixed DDR4 SPD clock speed rounding errors in the RAM SPD info
                • Fixed DDR3 SPD Register manufacturer/type in the RAM SPD info not appearing correctly
                • New config file parameter 'ECCINJECT' for specifying whether to enable/disable ECC injection
                • New config file parameter 'MEMCACHE' for specifying whether to enable/disable memory caching
                • New config file parameter 'PASS1FULL' for specifying whether the first pass should run the full iteration or reduced iteration
                • New config file parameter 'ADDR2CHBITS' to specify the address bits to XOR to determine the memory channel
                • New config file parameter 'LANG' for specifying language to use on startup
                • Fixed potential crash or other unexpected behaviour due to memory issues with random functions
                • Reports are now saved using UTF16 encoding to support Unicode characters
                • Increased the number of supported memory controllers to 8
                • Changed memory allocation behaviour by only pre-allocating memory segments >= 16MB to prevent memory starvation
                • For Test 13 Hammer Test, only run in parallel mode if the memory segment per CPU is >= 32MB (minimum required to support bits being hammered)
                • Fixed "Hammer Test" text not appearing in test report
                • When mapping memory layout, removed several limits reducing the memory space tested
                • Fixed memory being allocated after memory layout has been mapped (thus changing the memory layout)
                • Fixed memory leak when cleaning up after test completion
                • Fixed memory leak when decoding PNG files
                • Fixed progress bar not displaying 0% on completion of a pass
                • Console resolution is now forced to 80 x 25
                • Graphics resolution is now set to a minimum of 800 x 600
                • Updated to new UEFI SDK libraries (UDK2014)
                • Fixed memtest86v4 incorrectly booting to serial mode by default

                Comment


                • #23
                  Sorry for the delay, but yes, SecureBoot was exactly the problem. Dunno how I missed that; I tried almost all the other options (legacy etc.).

                  Thanks.

                  Comment


                  • #24
                    That sounds awesome, thanks a lot!
                    Will look into this when I get some time.

                    Do you need some help with the translations? I could help you out with the German one.

                    Would it be possible to explain 'ADDR2CHBITS' a bit?

                    Comment


                    • #25
                      Originally posted by orioon View Post

                      Do you need some help with the translations? I could help you out with the German one.

                      That would be greatly appreciated if you could help. If you are interested, please send us an email at help [at] passmark [dot] com.

                      Comment


                      • #26
                        The 'ADDR2CHBITS' parameter defines a list of bit positions of a memory address to exclusive-or (XOR) to determine which memory channel (0 or 1) is used. This is useful if you know that the memory controller maps a particular address to a channel using this decoding scheme. If this parameter is specified and MemTest86 detects a memory error, the channel number will be calculated and displayed along with the faulting address.

                        For example,

                        ADDR2CHBITS=1,8,9

                        will XOR bits 1,8,9 of the address to determine the channel.

                        Memory address 0xA00001D2 will map to channel 0 (Bits 1,8 are set)

                        Memory address 0xA00003D2 will map to channel 1 (Bits 1,8,9 are set)
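
The XOR decoding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not MemTest86's actual code; the function names `parse_addr2chbits` and `channel_for_address` are hypothetical.

```python
def parse_addr2chbits(value: str) -> list[int]:
    """Parse a comma-separated bit list such as '1,8,9' into integers."""
    return [int(field) for field in value.split(",") if field.strip()]

def channel_for_address(address: int, bits: list[int]) -> int:
    """XOR the selected address bits together; the result (0 or 1) is the channel."""
    channel = 0
    for bit in bits:
        channel ^= (address >> bit) & 1
    return channel

bits = parse_addr2chbits("1,8,9")  # same value as the ADDR2CHBITS example
for addr in (0xA00001D2, 0xA00003D2):
    print(f"{addr:#010x} -> channel {channel_for_address(addr, bits)}")
```

For the two example addresses this gives channel 0 and channel 1 respectively: the second address differs only in bit 9, which flips the XOR result.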

                        This is all basically a mechanism to tell you which slot the faulty RAM stick is in, thus avoiding the typical trial and error when you are testing a machine with multiple RAM sticks. So it can be a pretty useful time-saving feature, especially if you are testing a large number of identical machines, where you can identify which slot on the circuit board is Channel 0 or 1.

                        The obvious question is why isn't this being set automatically? The answer is that there is no industry standard for how RAM is mapped to addresses. Often there is interleaving of addresses between the channels and how this is done isn't always documented for particular CPUs / chipsets.

                        So this ends up being a feature that is useful only for (really) advanced users, and typically just those doing volume manufacturing who know the deep internals of their systems.

                        Comment


                        • #27
                          It doesn't sound that useful for normal desktops and laptops which typically only have 2 memory channels.
                          Three channels were once used, but Intel switched back.
                          Does anyone know if they are going for more channels again in the next generation of desktop CPUs?

                          In enterprise environments, where I can have up to 16 channels (4 Socket System) this is just what I needed.
                          However 8 channels are likely to be much more common in servers and workstations with 2 sockets, but this doesn't hurt the usefulness at all.
                          Typically there is just one DIMM per channel installed so I should be able to determine the faulty stick without trial and error in the majority of cases.
                          Of course, this requires that I have tested and documented the channel configuration beforehand.
                          Looks like I need to dig up some faulty sticks that I haven't gotten rid of yet.

                          Is this limited to the Pro edition? Probably it is because it is a config file parameter.
                          That would be a huge argument for businesses and very enthusiastic users to purchase it in my opinion and is definitely worth it.
                          Thank you so much for this!
                          Last edited by orioon; Dec-04-2014, 07:21 PM.

                          Comment


                          • #28
                            Hi,

                            beta2 has some display issues on a Dell M710 blade which I tested.
                            The space between lines in the menu is too small, which causes them to overlap.
                            (Can take a screenshot if helpful, got none right now)
                            Once I start a test, everything looks fine.

                            MemTest86 was also not able to detect the memory error, even though I know for a fact that the module in DIMM B1 is defective (ECC errors are being logged in the BIOS).
                            (Canceled after 24 hours; it was not able to complete 2 passes due to the lack of MP support in UEFI)

                            Firmware is very outdated, will soon update it, maybe that will fix it already.
                            The log (sent by mail) also contains some errors regarding the GetGlyph command.

                            Comment


                            • #29
                              Hi. I have two systems that hang at "Getting memory controller details" with the new Beta 2. One is an Asus X99-A (i7-5820k) and the other is an Asus X99-Deluxe (i7-5960X). The USB stops blinking and the keyboard is still responsive. Beta 1 didn't have this issue.

                              Comment


                              • #30
                                Originally posted by nobody101 View Post
                                Hi. I have two systems that hang at "Getting memory controller details" with the new Beta 2. One is an Asus X99-A (i7-5820k) and the other is an Asus X99-Deluxe (i7-5960X). The USB stops blinking and the keyboard is still responsive. Beta 1 didn't have this issue.
                                We might be able to provide a fix for it. Can you email us the MemTest86.log file under EFI\BOOT?

                                Comment
