
Burnin v1019 & v1020 report GPGPU and 2D fail.


  • Burnin v1019 & v1020 report GPGPU and 2D fail.

    Hi Passmark,
    We are seeing GPGPU and 2D test failures with the BurnInTest v1019 and v1020 updates.
    We cannot reproduce the failures with v1017 and v1018.
    The issue occurs on both Intel and Qualcomm platforms.
    There is no TDR, and we found an application popup error in the event log (please refer to the screenshot below).
    Would you please take a look for this issue?
    Thanks~
    [Attached screenshot: image.png]

  • #2
    You haven't really provided enough details, such as the hardware you are using.

    There are also other Windows events that can indicate a GPU (or GPU driver) failure. So just looking for 1060 isn't sufficient.

    We have had a couple of reports of issues with the GPGPU test, but so far they all appear to be driver & hardware issues. We've tried some tweaks to our code to improve how we handle errors (e.g. report a timeout instead of just locking up), but so far the errors look like real errors.




    • #3
      As background: in Version 10.2 build 1006 (18/April/202), we converted the existing GPGPU test from the DirectCompute (HLSL) programming interface to OpenCL. This lets the test support both Windows Display Driver Model (WDDM) GPUs, e.g. GeForce/Radeon, and compute-only GPUs (TCC), e.g. Nvidia Tesla GPUs, which have no video output.

      There are now two code paths. BurnInTest (BIT) first checks whether OpenCL is supported by the system; if it is not, BurnInTest falls back to the DirectCompute code path. OpenCL counts as supported if BIT can load OpenCL.dll. BurnInTest will not mix and match GPGPU code paths: e.g. if there are two video cards and only one of them supports OpenCL, both will still try to use OpenCL, even though the second card does not support it.
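The selection rule described above can be illustrated with a minimal sketch. This is not BurnInTest's code: the function names, the `load_library` parameter, and the backend labels are hypothetical, and the only detail taken from the post is that support is decided by whether OpenCL.dll can be loaded, with one choice applied to every card.

```python
import ctypes


def select_gpgpu_backend(load_library=ctypes.CDLL, library_name="OpenCL.dll"):
    """Pick one code path for the whole system (hypothetical helper).

    Returns "OpenCL" if the runtime library can be loaded, otherwise
    "DirectCompute". The loader is injectable so the rule can be
    exercised without a GPU or an OpenCL runtime installed.
    """
    try:
        load_library(library_name)
    except OSError:
        # ctypes raises OSError when the library cannot be found or loaded.
        return "DirectCompute"
    return "OpenCL"


def assign_backends(gpu_names, backend):
    """Give every GPU the same backend: there is no per-card mix-and-match."""
    return {name: backend for name in gpu_names}


if __name__ == "__main__":
    backend = select_gpgpu_backend()
    print(assign_backends(["gpu0", "gpu1"], backend))
```

Because the decision is made once per system rather than per card, a card without OpenCL support would still be driven through the OpenCL path in this sketch, which matches the behaviour described in the post.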

      The above has been mostly fine for a few years now.

      In Version 11.0 build 1018 and earlier, the BurnInTest GPGPU test used an OpenCL function called "clFinish()" to send the work to the GPU and wait for it to complete. In V11.0.1019 we stopped using clFinish because on some machines there was a chance it would block "forever", i.e. the test appeared stalled. We've seen this happen when video card drivers crashed (clFinish does not return in these situations), and possibly when there is a heavy workload on the video card from other tests and applications. Once the stall / freeze / crash happened, there was no recovery, as the thread never got out of the OpenCL code.

      In V11.0.1019 the timeout is 10 seconds, which we believe is more than generous for the amount of work being sent to the device. Where previously a freeze would end in a "No operations" error, you will now get a timeout message (like in your screenshot) and a chance for the test to recover on the next loop. So from your screenshot we can be sure the GPGPU test failed to do anything for 10 seconds, which most likely points to a driver bug or crash.

      In any case, we made a further change. Give the following debug builds a try: we added a non-blocking clFlush to push the work to the GPU. However, if the work is still not completed within 10 seconds, you will still receive the timeout error.
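To illustrate the difference between a blocking wait and the timeout behaviour described above, here is a small sketch in plain Python. It is not BurnInTest's implementation, and every name in it is hypothetical. The post does not say how completion is detected; in real OpenCL code, one plausible approach would be to call `clFlush` and then poll an event's `CL_EVENT_COMMAND_EXECUTION_STATUS` via `clGetEventInfo`, but that is an assumption. Here the completion check is just a callable.

```python
import time

TIMEOUT_SECONDS = 10.0  # the timeout quoted in the post


def wait_for_completion(is_complete, timeout_s=TIMEOUT_SECONDS,
                        poll_interval_s=0.005,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll is_complete() until it returns True or the deadline passes.

    Returns True on completion and False on timeout. Unlike a blocking
    wait, control always returns to the caller.
    """
    deadline = clock() + timeout_s
    while True:
        if is_complete():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


def run_test_loops(submit_work, loops, timeout_s=TIMEOUT_SECONDS, **wait_kwargs):
    """Run several test loops; a timeout is recorded and the next loop still runs.

    submit_work() is a hypothetical non-blocking submit that returns a
    callable reporting whether that batch of work has finished.
    """
    results = []
    for _ in range(loops):
        is_complete = submit_work()
        finished = wait_for_completion(is_complete, timeout_s, **wait_kwargs)
        results.append("ok" if finished else "timeout")
    return results
```

With a blocking call, a hung driver would keep the thread inside the wait indefinitely. With the polling loop, the same hang surfaces as a `"timeout"` entry and later loops get a chance to run.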

      Can you try these patch releases and let us know.

      x86-64: https://www.passmark.com/temp/BurnIn...g_20260205.exe
      ARM64: https://www.passmark.com/temp/BurnInTest_Windows_ARM64_dbg_20260205.exe



      • #4
        Hi Richard and David,
        Thanks for the feedback.
        We're now testing the debug build and will post the test results soon.



        • #5
          Hi Richard.
          We verified that 10 units pass with the "debug 1020" build.
          Will these changes go into a formal release that we can use for testing?
          Thanks~



          • #6
            Thanks for testing. A new public release with the same changes is now available:
            https://www.passmark.com/products/bu...t/download.php



            • #7
              Hi Simon,
              Thanks for the new public release.
              We'll arrange more units for testing and post the results soon.
