RAM errors only after Win7 restart


  • #16
    Originally posted by David (PassMark)
    Interesting.
    Are the errors always being reported in exactly the same small range of memory addresses?
    I think so, yes.

    Well, I've been searching for information about this problem for the past few weeks, but I hadn't found this thread at tomshardware until now. It seems the guy has exactly the same problem.



    • #17
      I ran Memtest last night (restarted from Windows) for the whole night, and 3 passes completed without errors (with USB3 disabled in the BIOS, of course).
      I guess I can now say for sure that the problem only occurs with the NEC USB3 controller enabled.
      So here is my understanding of what's happening (please correct me if I'm wrong):
      While the operating system is loaded, there is a lot of writing to RAM, but when the operating system shuts down, the BIOS clears everything from it. Because of a bug in the motherboard BIOS or the NEC firmware, the RAM space that the USB3 controller has written to is not erased. So something stays written in this RAM space until the power supply is turned off. When the OS starts again, that data is still sitting in that specific RAM space, but this time the operating system cannot read it. So if something gets written to this RAM space, a BSOD appears.

      Am I right about this?
      So this implies that the hardware is OK and it is only a software bug?

      Now that I think back on the BSODs, most of them happened while Windows was booting, and this seems logical: most of the sensitive data the OS writes is written during the boot process, so if the OS accidentally writes something to this "bad" address then, there is a BSOD. Once the OS is loaded, there is a much smaller chance of writing something sensitive to the "bad" address.



      • #18
        It is hard to be sure exactly what happens without having the machine in front of us and doing a detailed forensic investigation of the problem.

        But my speculation is this:

        • PCs, for at least the last 20 years, have used memory mapped I/O. See http://en.wikipedia.org/wiki/Memory-mapped_I/O
        • Each device (video card, sound card, USB controllers, etc.) potentially uses some of the memory addresses normally assigned to main RAM, effectively replacing / overlaying the RAM. So when you write a value to these special mapped I/O addresses you are writing to the device at that address and not to the RAM.
        • The range of addresses used by the device is variable. It might be a few bytes, or megabytes of RAM. If the device is disabled in BIOS then it shouldn't use any addresses.
        • Maybe (and this is the speculation), when you boot into Windows, the USB3 controller is activated when its device driver loads. So it starts doing memory mapped I/O at this point, or MMIO over an expanded address range.
        • Further speculation is that because this was the first motherboard to support USB3, the BIOS doesn't know much about it. Normally the BIOS should update its memory map so that MemTest86 and Windows know what addresses are safe to use as free RAM. See below for an example of what a memory map looks like.
        • Once the USB3 controller (the NEC chip in this case) is activated in Windows, it stays active, even through a reboot. This also implies that the BIOS fails to set the USB3 controller into a known state on a reboot.
        • MemTest86 works by writing data into RAM and reading the values back. But the USB3 chip's memory mapped addresses won't work like this. Writing to an address that the USB3 controller manages might instead send that value out across the USB port. Reading a value might, for example, read a value from the USB port. From the point of view of MemTest86 the values read are going to be random. Thus MemTest86 flags it as an error. MemTest86 should not test this area of memory, but it doesn't know this as the BIOS memory map is apparently wrong.
        • It is also possible that even Windows doesn't know these addresses are off limits once USB3 is active. This would provoke BSODs and other random behaviour (and probably the odd random USB fault).
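
        To make the write / read-back point concrete, here is a toy Python sketch. It is not how MemTest86 is implemented; it only models the idea in the list above. The class name, the address-space size, the position of the device window and the fixed value the "device" returns are all made up for illustration.

```python
class SimulatedAddressSpace:
    """Toy model: ordinary RAM, except for one window claimed by a device."""

    def __init__(self, size, mmio_start, mmio_end):
        self.size = size
        self.ram = [0] * size
        # Hypothetical device window [mmio_start, mmio_end); empty if start == end.
        self.mmio = range(mmio_start, mmio_end)

    def write(self, addr, value):
        if addr in self.mmio:
            # The value goes to the device (e.g. a register), not to RAM,
            # so nothing is stored for a later read-back.
            return
        self.ram[addr] = value

    def read(self, addr):
        if addr in self.mmio:
            # The device answers with its own state, unrelated to what was
            # written. A fixed 0x00 stands in for "whatever the device says".
            return 0x00
        return self.ram[addr]


def pattern_test(space, pattern):
    """Write a pattern everywhere, read it back, return mismatching addresses."""
    for addr in range(space.size):
        space.write(addr, pattern)
    return [addr for addr in range(space.size) if space.read(addr) != pattern]


# Device window active: the tester reports "errors" at exactly those addresses.
active = SimulatedAddressSpace(size=64, mmio_start=40, mmio_end=48)
print(pattern_test(active, 0xA5))    # [40, 41, 42, 43, 44, 45, 46, 47]

# Device disabled (empty window): every address behaves like RAM.
disabled = SimulatedAddressSpace(size=64, mmio_start=0, mmio_end=0)
print(pattern_test(disabled, 0xA5))  # []
```

        In this model the RAM is never faulty. The "errors" show up only because the tester was not told that the window belongs to a device, which is the situation speculated about above.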



        Example of what a BIOS memory map looks like



        • #19
          Thanks for this explanation.
          So what next?
          Is it useful to check the BIOS memory map (with USB3 enabled), and if so, how do I do it?
          Or should I write to the Gigabyte developers and ask them to investigate this problem and fix the BIOS? (But I doubt they will fix it, because this is an old motherboard.)



          • #20
            Yes, you could try reporting the issue to Gigabyte. But given it is a pretty technical issue on an older motherboard, I think you would have to fight with them over a long period for them to even investigate the issue, let alone fix it.

            From MemTest86's point of view, it will work if you cold boot the machine, so for RAM testing I would suggest doing this.

            If you believe the BIOS problem also makes Windows unstable, you can then disable USB3 on the motherboard and buy a PCI-E USB3 card, or just use USB2. (Buy a quality PCI-E card if you do buy one; some of the USB3 cards we tested weren't very good.)



            • #21
              Thank you David!
              I wrote to Gigabyte, so now I'm waiting.
              I'll post the answer/solution here.



              • #22
                After numerous messages sent to Gigabyte, I'm giving up.
                Their conclusion is that it is a memory compatibility issue.
                David, thank you very much for your effort and help!
                Without your help, I probably would not have discovered this problem with the USB3 controller on my own.



                • #23
                  It doesn't surprise me that you got a canned answer. The last fault I reported to Gigabyte resulted in a similarly vague, superficial response.

                  You could swap the RAM and then report that the problem isn't fixed, as a way of showing it isn't a compatibility problem. But I can't imagine that they'll ever properly investigate and fix the problem. Too big to care, and they'll have a newer model out next week, so why maintain the old one?

