Strange result - cured RowHammer by re-arranging SO-DIMMs

  • Strange result - cured RowHammer by re-arranging SO-DIMMs

    I'm prepping an iMac (27-inch Mid-2010) to give to someone, and have been testing RAM to make sure I only scrounged up good stuff. I have four 4GB SO-DIMMs, two from Hynix and two from Samsung, and with all of them installed I got a whole lot of RowHammer errors. When I RE-ARRANGED THEM WITHOUT REMOVING OR ADDING ANY (swapping pairs between banks and also swapping each pair side-to-side within the bank) all the errors went away.

    The user I promised this machine to has waited patiently (so far) but I'd like to finish this. On the other hand I don't understand this result and don't know if I can trust this RAM. Why would moving DIMMs around cure RowHammer errors?

  • #2
    Ideally, the row hammer test should be able to hammer row pairs within the same memory bank in order to expose errors in the RAM. However, this requires knowledge of how a memory address (e.g. 0x10408000) is mapped to physical DRAM rank/bank/row/column addresses, which is dependent on the particular chipset and how the RAM is slotted in the motherboard. More often than not, the decoding scheme is not revealed publicly by the chipset vendor.

    Nevertheless, MemTest86 attempts to hammer row pairs indiscriminately based on possible memory address offsets determined from experimentation. This method is able to reveal the most obvious errors, but results may vary depending on the chipset, the RAM itself, and how the RAM is arranged on the motherboard (which changes how a memory address is decoded to the individual DIMMs). So in your case, moving the DIMMs around may just be masking the issue: in the new arrangement MemTest86 may not be hitting row pairs within the same bank as effectively as it did in the previous one.
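
    To make the point about address decoding concrete, here is a rough sketch. The two decoders below are invented for illustration only; the bit positions are assumptions and do not describe any real chipset or how MemTest86 actually picks its offsets. It only shows how the same address pair can be a valid hammer pair under one layout and not under another.

```python
# Hypothetical address decoders. The bit positions are made up for
# illustration and do not correspond to any real memory controller.

def decode_a(addr):
    """Layout A (assumed): column bits 0-12, bank bits 13-15, row bits 16+."""
    return {"dimm": 0, "bank": (addr >> 13) & 0x7, "row": addr >> 16}

def decode_b(addr):
    """Layout B (assumed): like A, but bit 16 selects the DIMM and rows start at bit 17."""
    return {"dimm": (addr >> 16) & 0x1, "bank": (addr >> 13) & 0x7, "row": addr >> 17}

def is_hammer_pair(decode, a, b):
    """True if a and b decode to different rows in the same bank of the same DIMM."""
    x, y = decode(a), decode(b)
    return x["dimm"] == y["dimm"] and x["bank"] == y["bank"] and x["row"] != y["row"]

base = 0x10408000
offset = 0x10000  # a fixed offset guessed without knowing the real mapping

pair_ok_a = is_hammer_pair(decode_a, base, base + offset)
pair_ok_b = is_hammer_pair(decode_b, base, base + offset)
print(pair_ok_a, pair_ok_b)  # prints: True False
```

    Under layout A the two addresses land in the same bank on different rows, so alternating between them hammers the DRAM. Under layout B the same offset crosses to the other DIMM, so the accesses never stress a single bank.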

    • #3
      100% agree - the issue is still there.
      It is more than likely due to the rank/bank/DIMM/slot address interleaving that MemTest86 is no longer able to detect the failure, possibly because that particular memory is not available to MemTest86.
      (It may be allocated to BIOS, TSEG, GFX, etc.)
      If you really care: if there is a BIOS option called '2x refresh' or 'Row Hammer mitigation', enabling it may make the problem subside, as the failure becomes statistically very difficult to hit.
