False positive with test 10

  • False positive with test 10

    Hi,

    Thanks for your software which is very useful.
    I am running MemTest86 7.5 on a blade server to check newly purchased RAM.
    Test 10 always fails at the same position.

    Hardware details:
    Motherboard: S2600WP (chipset C602) https://ark.intel.com/products/61390...-Board-S2600WP
    CPU: 2 x Xeon E5-2650v2
    RAM: 256 GB: 16 x 16 GB PC3-12800 DDR3 Samsung Reg ECC

    Test 10 error (always 512 or 1024 errors per pass between 0x3FFFE018 and 0x3FFFF814):





    Swapping the RAM modules between slots doesn't change the result.
    Another blade with the same configuration except for the RAM (128 GB, 16 x 8 GB) also gives the same result:



    I checked the memory usage, and the failing addresses appear to be reported as free.

    One last point: MemTest86 version 4 reports no errors in test 10.
    All other tests showed no errors on both blades (4 passes).

    Please let me know your comments on these errors.

    Additional question: the memory speed is twice as fast on the 256 GB blade compared to the 128 GB one, although the RAM in both runs at 1333 MHz. Is that normal? As a consequence, testing 256 GB takes about the same time as testing 128 GB, around 24 hours.

    Best regards,
    Matthieu

  • #2
    Very likely it is due to a bug in the UEFI BIOS on the machine. The memory map is wrong.
    (MemTest V4 uses old style BIOS, and a different memory map)

    This small block of addresses has been flagged as being free RAM when in reality it isn't.
    Some other hardware device will be writing values into that memory block (the network card, the video card, or something similar).
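To see how small the affected block is, the two addresses from the first post can be checked with a few lines of arithmetic. This is only a sketch: the 4 KiB page size is an assumption (the usual x86 granularity), and the helper name `page_base` is made up for illustration.

```python
# Addresses reported in the first post for the test 10 failures.
first_error = 0x3FFFE018
last_error = 0x3FFFF814

PAGE_SIZE = 4096   # assumption: 4 KiB pages
ONE_GIB = 1 << 30


def page_base(addr, page_size=PAGE_SIZE):
    """Round an address down to the start of the page that contains it."""
    return addr & ~(page_size - 1)


# Distance between the lowest and highest failing address.
span_bytes = last_error - first_error

# Distinct pages containing the two reported addresses.
pages = sorted({page_base(first_error), page_base(last_error)})

# How far below the 1 GiB boundary the first affected page starts.
gap_to_1gib = ONE_GIB - page_base(first_error)

print(f"span between first and last error: {span_bytes} bytes")
print("pages touched:", [hex(p) for p in pages])
print(f"first page starts {gap_to_1gib} bytes below the 1 GiB boundary")
```

Under that page-size assumption, the reported range sits in the last two pages below the 1 GiB mark, which fits the picture of a small reserved region being mislabelled rather than a faulty module.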

    Most of the tests in MemTest86 execute very quickly, meaning there is only a very small window of time between writing the data and reading it back from any particular memory address. So any external hardware writing to the same addresses is unlikely to cause visible corruption. But in test 10 there is a long delay between the writing and the reading steps, so corruption from external writes can then be noticed.
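The timing argument can be illustrated with a toy model. This is not MemTest86's code: the function `run_test`, the tick counts, and the location of the device's writes are all hypothetical, chosen only to show why a long gap between write and read-back exposes an external writer that a short gap misses.

```python
def run_test(memory_words, delay_ticks, device_write_tick, device_range):
    """Write a pattern, idle for `delay_ticks`, then count mismatching words.

    A hypothetical device overwrites `device_range` once, at tick
    `device_write_tick`, if the idle phase lasts long enough to reach it.
    """
    pattern = 0xFFFFFFFF
    memory = [pattern] * memory_words              # write phase
    for tick in range(delay_ticks):                # idle phase
        if tick == device_write_tick:
            for i in device_range:
                memory[i] = 0                      # external write into "free" RAM
    return sum(1 for word in memory if word != pattern)   # read-back phase


# Hypothetical device buffer: 1024 words somewhere in the tested block.
device_range = range(100, 1124)

quick = run_test(4096, delay_ticks=1, device_write_tick=50, device_range=device_range)
slow = run_test(4096, delay_ticks=300, device_write_tick=50, device_range=device_range)

print("errors with short delay:", quick)
print("errors with long delay:", slow)
```

In this model the short-delay run finishes before the device writes and reports nothing, while the long-delay run sees every word the device touched as an error.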

    There is a small chance a BIOS firmware upgrade might fix it. Otherwise report the problem to Intel for a fix (if they care, which is unlikely).



    • #3
      Thanks David,

      I am already running the latest BIOS. Since it's an old motherboard, I don't think they will fix it.

      Do you have an idea about the question regarding memory speed?
      Additional question: the memory speed is twice as fast on the 256 GB blade compared to the 128 GB one, although the RAM in both runs at 1333 MHz. Is that normal? As a consequence, testing 256 GB takes about the same time as testing 128 GB, around 24 hours.

      L1, L2 and L3 speeds are also affected (visible in the screenshots). I wonder whether the speed difference only affects MemTest86 or the OS as well.



      • #4
        Maybe with more RAM sticks you make better use of the quad memory channels in the CPU (this would depend on the slots you used). There might also be NUMA effects with dual CPUs.
        That doesn't really explain the change in the L1 cache speeds, however.
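For a rough sense of scale, the theoretical peak bandwidth can be worked out per number of active channels. This is a back-of-the-envelope sketch: it assumes "1333 MHz" means 1333 MT/s and a 64-bit (8-byte) data bus per channel, and the function name `peak_bandwidth_gb_s` is made up here. Measured speeds will be lower, and this does not account for NUMA or cache effects.

```python
def peak_bandwidth_gb_s(transfers_per_s, channels, bus_bytes=8):
    """Theoretical peak in GB/s: transfers/s x bytes per transfer x channels."""
    return transfers_per_s * bus_bytes * channels / 1e9


MT_PER_S = 1333e6   # assumption: DDR3 at 1333 MT/s

# Peak per CPU socket for one, two and four active channels.
for channels in (1, 2, 4):
    peak = peak_bandwidth_gb_s(MT_PER_S, channels)
    print(f"{channels} channel(s): {peak:.1f} GB/s peak")
```

Halving the number of channels in use halves the ceiling, so a slot layout that leaves channels idle could plausibly produce a factor-of-two difference in memory speed, though not in cache speed.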



        • #5
          In both cases all 16 slots are populated, and the same RAM model is used in every slot, so quad channel should be active.
          I don't know whether there is any additional test I could run to pinpoint the issue.
