ASRock X99 with Xeon E5-2670 v3: ECC working or not?

  • ASRock X99 with Xeon E5-2670 v3: ECC working or not?

    Plenty of X99-based motherboards say that they support ECC DDR4 DIMMs. But the typical Hollywood-style BIOS they put on consumer-oriented motherboards these days will let you control every tiny detail of DRAM timing, yet does not permit any control over ECC settings.

    And tools like HWiNFO and CPU-Z report ECC support as not being available on the system.

    I was getting my blood pressure up, because I had just spent 50% more on ECC DIMMs and really like to be on the safe side with a system that runs 24 threads on 128GB of memory, when I decided to let Row-Hammer have a go on it.

    And, behold, memtest86 reports ECC being available and working, even if ECC error injection doesn't seem to be available (it is at least available on another Xeon E3-1276/C216 machine I have).

    So does memtest86 enable ECC itself, looking at the capabilities of the DIMMs and the CPU and fiddling with the MSRs or similar, or does it simply report what it finds available and working?

    In the first case, it would just add to my frustration that ECC support, even if it is physically there, is "disappeared" to satisfy the market segmentation wishes of Intel, the ROM makers etc.

    In the second case, I could live happily with the fact that I can't configure just how ECC is supposed to work (scrubbing intervals and suchlike), which I've always left on AUTO to begin with.

    If ECC injection were available and proven to work properly, I wouldn't have to ask this question.

  • #2
    Followup question: If I understand correctly, all non-ECC boards with DDR3/DDR4 should fail Rowhammer, right? Which isn't really a stability issue in most cases, because ordinary software won't have these access patterns, right?

    So if the Xeon E5-2670 on the X99 chipset with Crucial ECC DDR4 passes Rowhammer without even a warning (but perhaps a hint that some ECC work was actually done), that should be a strong indication of ECC being active and working normally (unless enabled by memtest86 itself)?



    • #3
      MemTest86 simply reports whether ECC error detection is supported and/or enabled; it does not perform any enabling of ECC detection if it's disabled.

      Not all systems are vulnerable to row hammer errors as it depends on the RAM itself as well as the chipset's memory controller logic. Using ECC memory does provide some protection from 1-bit row hammer errors, but may not protect from all multi-bit row hammer errors. Any row hammer errors corrected by ECC memory would be detected and displayed by MemTest86 on screen.
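The difference between 1-bit and multi-bit errors can be illustrated with a toy SECDED (single error correct, double error detect) code. This is only a sketch: real ECC DIMMs protect 64 data bits with 8 check bits, while the example below assumes a tiny extended Hamming(8,4) code, and the names `encode`, `decode` and `flip` are made up for illustration. It is not how MemTest86 or any memory controller is actually implemented.

```python
"""Toy SECDED code: extended Hamming(8,4), a small stand-in for DIMM ECC."""


def encode(nibble):
    """Encode 4 data bits into 8 code bits (index 0 is the overall parity bit)."""
    code = [0] * 8
    # Data bits live at Hamming positions 3, 5, 6 and 7.
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    # Each check bit covers the positions whose index has the matching bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Overall parity makes the number of set bits in the whole word even.
    code[0] = sum(code[1:]) % 2
    return code


def decode(code):
    """Return (status, data); data is None when the error is uncorrectable."""
    word = list(code)
    # The syndrome is the XOR of the positions of all set bits (0 if valid).
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    parity_odd = sum(word) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd parity is treated as a single flipped bit at position `syndrome`
        # (syndrome 0 means the overall parity bit itself was hit).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity with a non-zero syndrome: two bits flipped.
        return "uncorrectable", None
    data = word[3] | word[5] << 1 | word[6] << 2 | word[7] << 3
    return status, data


def flip(code, *positions):
    """Return a copy of the code word with the given bit positions inverted."""
    word = list(code)
    for pos in positions:
        word[pos] ^= 1
    return word


if __name__ == "__main__":
    good = encode(0b1011)
    print(decode(flip(good, 5)))        # one flipped bit
    print(decode(flip(good, 3, 6)))     # two flipped bits
    print(decode(flip(good, 1, 2, 4)))  # three flipped bits
```

In this toy code a single flip is corrected and a double flip is flagged as uncorrectable, but the triple flip comes back as "corrected" with the wrong data. That is the sense in which ECC may not protect against all multi-bit errors.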



      • #4
        Originally posted by keith
        MemTest86 simply reports whether ECC error detection is supported and/or enabled; it does not perform any enabling of ECC detection if it's disabled.

        Not all systems are vulnerable to row hammer errors as it depends on the RAM itself as well as the chipset's memory controller logic. Using ECC memory does provide some protection from 1-bit row hammer errors, but may not protect from all multi-bit row hammer errors. Any row hammer errors corrected by ECC memory would be detected and displayed by MemTest86 on screen.

        Thank you for that clarification. Perhaps the principle that MemTest86 does not modify any memory configuration registers at all (apart from the error injection ones, obviously), but limits itself to reporting what it finds, could be highlighted in bold somewhere early in the documentation.

        With error injection either disabled (my Xeon E5 v3) or locked (my Xeon E3 v3), I had hoped that the Row-Hammer test would essentially let me do error injection.

        I remember quite vividly my shock when, some time ago, Test 13 produced massive amounts of errors on a brand new Kaveri system I just wanted to verify at DDR3-2400 DRAM speeds. At that time I had never heard about Row-Hammer and just needed the fastest RAM money could buy to maximize the APU's graphics performance (I still wanted it to work correctly). I relaxed a bit after reading what Test 13 does (I still keep wishing AMD APUs would support ECC in the socket variants, because I run these as low-power servers).

        At the time, ECC was still assumed to protect against row-hammer, but now we know that it's just a matter of how many bits are flipped in the process.

        Skimming the forum for row hammer feedback, I had the fleeting impression that the aggressiveness of Row-Hammer could be modulated somewhere between "sure to flip lots of bits" and "potentially flipping a bit here and there unless you increase your refresh".

        And while nobody would want to use the most aggressive mode to test the RAM chips themselves, that could potentially be used to test if ECC logic is active: It should produce uncorrectable ECC errors when RAM is basically thrashed while it might show corrected ECC errors in a softer setting.

        In other words I was thinking that Row-Hammer could be used as some kind of back-door error injection method to check on ECC being active and working.
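For checking whether corrected or uncorrected ECC events are being counted outside of MemTest86, one option is to read the Linux EDAC counters after a stress run. The following is only a sketch under assumptions: it assumes a Linux system with an EDAC driver loaded for the memory controller, which exposes `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc/`. The function name `read_edac_counts` is hypothetical.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"corrected": n, "uncorrected": n}} from EDAC sysfs."""
    counts = {}
    root = Path(base)
    if not root.is_dir():
        # No EDAC directory: no driver is loaded, so nothing can be concluded.
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found (driver not loaded?)")
    for name, values in result.items():
        print(name, values)
```

An empty result only means that no EDAC driver has registered a memory controller; it does not prove that ECC is off. Counters that stay at zero are equally inconclusive, which is why a way to provoke a correctable error would be useful in the first place.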
