External drive causing errors?

  • External drive causing errors?

    This is odd. I successfully set up Memtest on a CD to check the 32GB of RAM in my MacPro4,1 (OSX 10.8.3) per Beth's instructions in an earlier thread (thank you!). Running it, I instantly got errors, so I started pulling out RAM and re-testing. In a nutshell, it appears my memory wasn't the issue, it was an external RAID drive.

    How is that possible? And what does that mean?

    Details:

    Here's the test protocol. The numbers indicate the bay the memory was originally in when I started; as I isolated the issue, I always put the memory in the Mac's first 1-2 bays.

    1. 4,3,2,1 - Errors
    2. 2,1 - Errors
    3. 4,3 - No errors
    4. 1 - No errors
    5. 2 - No errors
    6. 4,3,2,1 - No Errors

    But here's the thing. I noticed during the first two runs that when Memtest was starting up, on the startup screen with the copyright (©Silicon Image, etc...), there was an extra line:

    1 eSATA-2 ExternalRAID 3726GB

    noting the external I had plugged in. So for the hell of it I unplugged it between Tests 2 and 3. Once I got down to the 6th test, which now showed no errors, I guessed the culprit was the external. Sure enough, testing once more with the external on and all memory in, I got the same errors.
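    For anyone following the logic, the runs described above can be tabulated in a short sketch. The run data comes from this post (the seventh entry is the extra re-test with the external on); the helper name `explains_errors` is made up for illustration. It only checks which single factor was present in exactly the runs that showed errors, nothing more.

```python
# Each run: (DIMMs installed, labelled by original bay,
#            external eSATA drive connected?, errors reported?)
runs = [
    ({4, 3, 2, 1}, True, True),    # test 1
    ({2, 1}, True, True),          # test 2
    ({4, 3}, False, False),        # test 3 (drive unplugged from here on)
    ({1}, False, False),           # test 4
    ({2}, False, False),           # test 5
    ({4, 3, 2, 1}, False, False),  # test 6
    ({4, 3, 2, 1}, True, True),    # extra re-test with the external on
]


def explains_errors(present_in_run):
    """True if errors occurred in exactly the runs where the factor was present."""
    return all(
        present_in_run(dimms, drive) == errors
        for dimms, drive, errors in runs
    )


results = {
    f"DIMM from bay {bay}": explains_errors(lambda d, _drv, bay=bay: bay in d)
    for bay in (1, 2, 3, 4)
}
results["external eSATA drive"] = explains_errors(lambda _d, drv: drv)

for factor, consistent in results.items():
    print(f"{factor}: {'matches' if consistent else 'does not match'} the error pattern")
```

    With only seven runs this is a correlation, not a proof, but no individual DIMM lines up with the errors while the external drive does.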

    So my question is what this means. The screen indicates Memtest is only testing the internal memory, so how could an external (a G-Tech 1TB RAID drive) prompt errors? Is it instead including the external's memory in its test?

    More importantly, does this mean that my external is somehow corrupting the on-board memory when it's on? I didn't even know externals had memory to corrupt, but does this mean that's something I should ask the manufacturer about?

    Screenshots:

    Above: first run, external drive connected

    Above: later in the first run

    A later pass with all memory in but the external OFF

    Thanks in advance for any insight(s) you might have.

  • #2
    There is a collection of interesting things in the screen shots.

    The first thing is that this Xeon was detected as only having 1 core. Do you know if you have changed anything to cause this? Otherwise it might be a bug.

    The second thing is that the total error count doesn't match the sum of the individual test errors. We investigated this today and found it is the result of two bugs. One, which we accidentally introduced in the V4.2 release, causes the transition from 0 errors to 8 errors to appear as 80 errors. The other is that if multiple tests find the same error (same address, same bits, etc.), the error count in the right column isn't updated, so the errors go unreported for a period. We have fixed both of these issues for the next patch release.

    So what the screen shots really show is that you had 8 errors in the same memory location.

    The memory location is low in memory (under 1MB) so it is possible this is another bug similar to the USB keyboard bug. The UEFI BIOS might not be supplying the correct memory map to exclude the memory mapped I/O locations used by the disk controller. We are still looking into this.

    Can you check whether Apple has put out any EFI BIOS updates for the machine that you don't already have installed? There are some details here:
    http://support.apple.com/kb/HT1237


    • #3
      This error has some strong indications of a legitimate memory error, in particular the fact that it fails on only one test. The first thing I would recommend is to repeat the test a few times with and without the drive attached, to make sure the symptoms are not coincidental.

      If the error is indeed associated with having the external drive connected, then the BIOS is providing an erroneous memory map, specifically indicating that memory is available to the OS when in fact it is not.


      • #4
        Thanks, David. I checked the link to the BIOS updates you provided, and my firmware is more recent than anything on the list, so presumably I'm up to date. Here's my data:

        Boot ROM Version: MP41.0081.B07
        SMC Version (system): 1.39f5

        If there's anything else you need from me, including additional testing, let me know.

        Also, re: # of cores, there should be 4:

        Last edited by MGiraud; Mar-22-2013, 04:25 PM.


        • #5
          Thanks, cbrady. I did try it a few times with and without at the end, but you're right that I didn't spend much more time on it once I guessed the drive seemed to be the issue. If it's something that Passmark is interested in working on, I'd be happy to do more testing in a structured way that might really isolate the issues.

          Barring that, tho, it does seem like the external is the issue, and specifically the eSATA drive (I also have another drive connected, but via FireWire, which neither showed up on the memtest startup screen nor made a difference when I turned it off or on). I guess my question is still where the ultimate problem lies: my computer's memory or some defect in the external drive's hardware...


          • #6
            It is similar to this issue, in that both errors are in low RAM and occur on Test #6.

            No solution as yet, however.
