errors (block move) only with manual selection

  • errors (block move) only with manual selection

    I've been testing one computer for a couple of days now - the main components have been fine for 3 years (hard drives failed on this machine before, none recently though)

    I ran 4 separate pairs of DDR3-1600 through memtest86 overnight and got many errors (occasionally the first 2 or 3 passes showed no errors)

    With memtest86 version 4.3 and one core setup they ran with no errors

    Then I tried manually picking the one test that had failed, block move (keys c, 1, 3, 6, enter, 0) and immediately upon the test's start it shows errors. Doing this multiple times occasionally freezes the system.

    Error confidence is always >100 and errors show up at different locations ("lowest error address" has been as low as 3M and as high as 2100MB)

    To summarize: the entire suite, run contiguously for 8 to 12 hours, produces no errors; picking only "block move" immediately produces errors.

    Am I right to conclude that the memory may be just fine (2 manufacturers, G.Skill and Corsair; all 8 sticks check out fine, then produce the same errors at different locations) and that either the CPU or the motherboard (BIOS, maybe?) is bad?


    I'm running with no peripherals - hard drives, network cards, nothing, though the motherboard has built in audio, video & network

    system: (no overclocking, BIOS defaults)
    AMD Phenom II X4 945 3 gigahertz
    motherboard: asrock m3A790gxh/128M

    memory (all 9-9-9-24)
    corsair cml16gx3m4a1600c9 (4x4GB set, tested 4@once and 2 separately) (this is the one that ran for years with no problems)

    gskill f312800cl9Q-16GBZL (4x4GB set, tested 4@once and 2 separately)

    edit: I forgot to ask the actual question that prompted me to register: is there anything uniquely or especially stressful about starting the "block move" test all by itself, out of sequence? This same test ran with zero failures on this same memory, and fails almost every time when I trigger it out-of-order.
    Last edited by Sanjeev K Sharma; Aug-05-2013, 08:57 PM.

  • #2
    > With memtest86 version 4.3 and one core setup they ran with no errors
    > Then I tried manually picking the one test that had failed

    I'm not sure I understand this part of the post.
    You ran all the tests and had no test failure, but then picked the test that failed in order to run it by itself? If no test failed, how did you pick the one that failed?

    The block move test has given some grief in the past, but only when run in multithreaded mode on some motherboards. We aren't aware of any issues in single-threaded mode.

    You should get the same result regardless of the test sequence (assuming all environmental factors, e.g. temperature, are equal). But of course if test #6, block move, is generating errors, you'll get the errors at a higher rate by running just this single test in a loop.

    So your results are a bit inconclusive. Could be a bug in MemTest86, could still be a hardware failure (e.g. Motherboard or CPU), could even be a design flaw or errata in the CPU/Motherboard.

    I don't suppose you have a 2nd machine to test the RAM in?

    • #3
      thanks for the reply.

      It was the test that failed in multi-core mode, with both memtest86 v4.2 and v4.3. My Ubuntu install had v4.2 as a boot option, and I did not know about the threading issues until after several overnight tests, all with many block move failures.

      > I don't suppose you have a 2nd machine to test the RAM in?

      not yet. I've emailed contacts who may have one (and may know that they have one)

      but an update: this machine failed mprime after my first comment - I ran the test that allegedly stays mostly on CPU cache.
      Last edited by Sanjeev K Sharma; Aug-06-2013, 05:09 AM.

      • #4
        Originally posted by Sanjeev K Sharma
        > thanks for the reply.
        > but an update: this machine failed mprime after my first comment

        Interesting. Let us know the outcome.

        In V4.3 we default to single-threaded mode, partly due to problems in this test. We never got to the bottom of the problem, however. Either about 1% to 5% of machines on the market have a latent hardware fault or design flaw, or there is a bug in MemTest86 V4.x that we couldn't find in the time available. We have since moved on to development of V5 instead.

        If the machine failed mprime as well, then this tilts the balance a bit towards a widespread hardware issue.

        • #5
          We did some further testing on V4.3.

          There is a bug in Test #6 - block move. This test moves blocks of memory around and then verifies the values in memory are correct after the moves, by doing a compare with adjacent blocks of RAM.
          It seems there is a bug in the boundary checking, meaning that the test is reading areas of RAM it shouldn't be. The byte values in these areas are either random or set by tests that ran earlier, so running the tests in different orders can result in different behaviour.
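
A toy model may help show why a boundary bug of this kind makes results depend on test order. This is not MemTest86's actual code: the off-by-one in the verify loop, the fill pattern, and the names `block_move_test` and `buggy_bounds` are all assumptions made for illustration. The sketch copies one block onto its neighbour, compares the two, and (in the buggy variant) compares one byte past the end of the region under test.

```python
def block_move_test(mem, start, length, buggy_bounds=False):
    """Toy block-move test on a bytearray; returns the number of mismatches.

    Hypothetical illustration only, not the real MemTest86 algorithm.
    """
    half = length // 2

    # Write a known pattern into the source block.
    for i in range(half):
        mem[start + i] = (i * 7 + 1) & 0xFF

    # "Move" the source block into the adjacent destination block.
    mem[start + half:start + 2 * half] = mem[start:start + half]

    # Verify by comparing the two adjacent blocks. The buggy variant
    # checks one byte too many, so its last comparison reads a byte
    # outside the region the test itself wrote (assumed form of the bug).
    count = half + 1 if buggy_bounds else half
    errors = 0
    for i in range(count):
        if mem[start + i] != mem[start + half + i]:
            errors += 1
    return errors


# Memory as left behind by two different (hypothetical) earlier tests.
after_fill_01 = bytearray([0x01] * 64)
after_fill_ff = bytearray([0xFF] * 64)

# Same healthy "RAM", same test, different history.
print(block_move_test(after_fill_01, 0, 32, buggy_bounds=True))
print(block_move_test(after_fill_ff, 0, 32, buggy_bounds=True))
print(block_move_test(bytearray([0xFF] * 64), 0, 32))
```

In this model the stray comparison happens to match when the earlier test left 0x01 behind and mismatches when it left 0xFF, while the correctly bounded version reports no errors in either case. That mirrors the reported symptom: a clean full-suite run, and errors when the test is started on its own.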

          As best we can tell this bug has been in the code for a while (maybe forever).

          There will be a fix available in the next V4.x patch release.
          The problem doesn't affect the V5 release in UEFI mode.

          • #6
            I've been reading past posts for more technical info and found your update here - thanks for the follow-up.

            You ran more tests despite thinking you already had an explanation. Impressive.
