multi CPU test fails / single CPU mode passes

  • David (PassMark)
    replied
    We implemented a work-around in the V4.3 release that should allow testing on machines that have this problem. The memory testing range bug was also fixed in this release.

    See the post on the MemTest86 V4.3 release for more details.



  • David (PassMark)
    replied
    We fixed the problem with testing memory ranges, but the fix isn't public yet.

    The initial problem you had is probably the same as this problem:
    http://www.passmark.com/forum/showth...ails-at-Test-3

    Since your initial post we purchased a new test machine with the same specs as the one that exhibited the problem (a Xeon E3). We were able to reproduce the problem with Test #3 on the new machine, but the root cause is elusive. Basically, when multiple threads are running, the CPU registers appear to become spontaneously corrupted during the test, causing a flood of errors. We don't know if this is a CPU erratum, a bug in the way multi-threading is set up, or something more subtle.
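
For readers unfamiliar with how a multi-CPU memory test differs from a single-CPU one, here is a minimal sketch, assuming each worker thread is given its own slice of a buffer to fill with a pattern and read back. It is not MemTest86's code: `check_slice`, `run_test` and `PATTERN` are hypothetical names, and because Python threads share one interpreter lock, it only illustrates the partitioning idea and cannot reproduce the register corruption described above.

```python
import threading

PATTERN = 0xA5  # hypothetical test pattern, chosen only for illustration


def check_slice(buf, start, end, errors, index):
    """Write PATTERN to buf[start:end], read it back, and count mismatches."""
    for i in range(start, end):
        buf[i] = PATTERN
    bad = 0
    for i in range(start, end):
        if buf[i] != PATTERN:
            bad += 1
    errors[index] = bad  # each worker writes only its own slot


def run_test(size, workers):
    """Split a buffer of `size` bytes across `workers` threads; return total errors."""
    buf = bytearray(size)
    errors = [0] * workers
    chunk = size // workers
    threads = []
    for w in range(workers):
        start = w * chunk
        # The last worker also takes any remainder left by integer division.
        end = size if w == workers - 1 else start + chunk
        t = threading.Thread(target=check_slice, args=(buf, start, end, errors, w))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return sum(errors)
```

Calling `run_test(size, 1)` corresponds loosely to single CPU mode and `run_test(size, 4)` to a four-thread run. On healthy hardware both should report the same error count, which is why a result that differs only in multi-CPU mode points away from the RAM itself.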



  • xerces8
    replied
    Any news on the front?
    (I was away for a few days; if I did not answer a question, please repeat it.)



  • xerces8
    replied
    Slovenia, main PC.

    What kind of access are you talking about?



  • David (PassMark)
    replied
    "...something weird happened. All 10 tests execute in about one second."
    Looks like this is a bug. Unrelated to the initial problem you have, but a bug nonetheless. We can see similar strange behaviour when we set a high limit for the starting memory address. As we can reproduce this here, it should be fixable quickly.

    What country are you in? Is this your main PC? I am just wondering how easy it would be to get access to your machine for deeper testing.



  • xerces8
    replied
    I set the lower address limit to 2 GB and something weird happened.
    All 10 tests execute in about one second. See screenshot:



  • David (PassMark)
    replied
    OK, here is another test to try.

    Can you go into the configuration window in MemTest86 and choose to test just the top 2GB of RAM, i.e. no testing from 0GB to 2GB?
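
As a rough illustration of what "test just the top 2GB" means in address terms, the sketch below computes the range that would be covered, assuming a flat 4 GiB address space with no memory holes. The function `select_range` and its parameters are hypothetical and are not part of MemTest86.

```python
GIB = 1024 ** 3  # bytes in one GiB


def select_range(total_bytes, lower_limit=0, upper_limit=None):
    """Return the (start, end) byte range to test, with `end` exclusive.

    Raises ValueError if the limits leave nothing to test, so that an
    empty range is reported instead of being "tested" instantly.
    """
    if upper_limit is None or upper_limit > total_bytes:
        upper_limit = total_bytes  # clamp to the end of installed memory
    if lower_limit < 0:
        raise ValueError("lower_limit must not be negative")
    if lower_limit >= upper_limit:
        raise ValueError("address limits leave no memory to test")
    return lower_limit, upper_limit


# Top 2 GiB of a 4 GiB machine: skip everything below the 2 GiB mark.
start, end = select_range(4 * GIB, lower_limit=2 * GIB)
```

Rejecting an empty range is a design choice of this sketch: a run that covers no memory finishes almost immediately and can look like a pass.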



  • xerces8
    replied
    I ran memtest 4.3.0 beta in VMware, just to see what happens. All tests ran without detecting any errors (I ran them for only a few minutes). I configured the VM to have 4 CPUs, so the multithreaded tests would execute as on real hardware.



  • xerces8
    replied
    1.) I run Windows 8 64 bit. Before that I had 7 and XP. No problems in daily usage.
    2.) CPU test ran OK, temp up to 68 C. No problems noticed.
    RAM test passed OK, nothing special observed.
    3.) I started 2 instances of the 7-zip built-in benchmark with 4 threads each and according to CoreTemp the CPU temperature climbed to 60-66 degrees Celsius in a minute and stayed there for the duration of the test.
    4.) No BIOS updates are available, at least no official ones (and I don't know of any unofficial ones either).



  • David (PassMark)
    replied
    Given that it is happening across most of the tests and not just test #3, it is less likely to be a coding bug in MemTest86. I guess there is a slim chance there is a bug in the multi-threading code, but I think someone else would surely have encountered this problem before now.

    We went through the source code of Test #3 and couldn't find any fault in the code that would produce the effect you saw.

    So I think it is worth trying to see if this is a real hardware fault. Maybe not with the RAM, but with the BIOS, MB or CPU. For example:

    1) How stable is this machine in Windows in normal use (I assume you are running Win7 64bit)?

    2) Can you download the trial of BurnInTest for Windows and select just the CPU test to run, then select just the RAM test (both at 100% load)? Is the machine stable under this load?

    3) What are the CPU temperatures like under heavy load?

    4) Are there any BIOS updates available for the machine?



  • xerces8
    replied
    Also, I noticed that while the multi CPU tests are running, the keyboard is unresponsive.



  • xerces8
    replied
    I did some more tests, running each test separately:
    - start Memtest86 v4.3.0 beta in default mode
    - press c 1 3 ... to select a single test

    Results:
    - single CPU tests run for hours without errors (tests 0, 1, and 10)
    - all multi CPU tests fail in the first second (and keep counting errors fast), except tests 6 and 2:
      - test 6 did not detect any error in 20 minutes of run time, then I stopped it
      - test 2 ran 5 minutes without errors, then started reporting a lot of them

    Here are screenshots of each test:

    http://imgur.com/a/DGwwO



  • David (PassMark)
    replied
    Thanks for that.
    We'll do some investigation.

    I did in fact try to turn on permissions to post images in the forum. It didn't work, for reasons unknown, and I haven't yet had time to debug the forum software to work out why.



  • xerces8
    replied
    The multi CPU problem happens with all RAM modules.

    I just ran memtest86+ v4.2 the whole day (15 passes) with the 2x2GB RAM and no errors were detected. I have used these modules for a year and ran memtest a couple of times in the past, and it never detected any errors.


    Here is the screenshot from today's run:


    Screenshot of multi cpu error:



    Screenshot of same in Individual error report mode and scroll locked:


    And as a bonus, this happened on one run of 4.3.0 beta in default mode, a few seconds after it started printing memory errors:


    PS: I don't see the attachment manager anywhere??? I'll try to add the pictures in a minute.
    PPS: I guess this is one of those "no file uploads" forums...
    Last edited by xerces8; 06-13-2013, 11:20 PM.



  • David (PassMark)
    replied
    "The multi CPU error frenzy happens with them too"
    Which set is 'them'? It isn't clear.

    It is entirely possible that you have 1 defective byte (out of 4 billion) and simply haven't noticed it until now.
