Variable scores for Fonts & Text, Windows Interface, and PDF Rendering

  • Variable scores for Fonts & Text, Windows Interface, and PDF Rendering

    I recently purchased a Galaxy Book4 Pro 360 (155H CPU w/ NPU and Arc graphics) and did a fresh install of Windows 11 on it. Before the reinstall, and during various changes after, I ran some PerformanceTest benchmarks. A few strange things:
    • After the OS reinstall, my Fonts & Text score was half of what it was prior (there was one score after that was close to the original, but the other 5 runs were around 50%).
    • After the OS reinstall, my PDF Rendering score increased by about 65%.
    • When updating the NPU driver from the newest Samsung one to the more recent Intel one, Windows Interface scores dropped to about 85% of what they were originally.
    • Disk random read/write scores dropped by about 10% after OS reinstall, but this turned out to be due to device encryption being enabled by default.

    I have done my best to make all test conditions controlled, including watching temperatures, background processes, matching settings, disconnecting all external devices, etc. I restart between runs and let the computer return to idle temperatures between individual tests (on that note, some setting to go through the entire battery of tests with a configurable delay between the individual tests would be awesome).

    I have not been able to find anything that significantly affects these first three test results. My best theory is that somehow the NPU is involved, but I know very little about how and when the NPU is used.

  • #2
    At the moment the NPUs in all CPUs are nearly useless. This is because
    1) There is no uniform way to program NPUs. So different code is required for different CPUs. This might change in 2025
    2) The documentation and code examples are really awful. Especially from AMD. But Intel & Qualcomm aren't much better. The AMD NPU is currently completely unusable.
    3) GPUs are faster than NPUs, so it doesn't make sense to write code for the NPU if you have a half-reasonable GPU. The NPU is slightly more power efficient, but that doesn't really matter for a 20-second job.
    4) The hardware isn't available in many CPUs (so why write code for them) and the hardware capabilities are pretty different.
    5) Cloud based AI is better if you have an internet connection.

    All this means that your NPU probably isn't being used at all, ever.

    > there was one score after that was close to the original

    Maybe some of the results are "random". And the initial results before the Win11 reinstall might have also been random.
    And by random, I mean there is some external factor messing with the results that hasn't been identified. e.g. background tasks, power profiles.
    Could also just be device driver version changes. Most vendors aren't testing 2D performance as they consider it not very important. So they don't notice performance regressions in new drivers.





    • #3
      Interesting about the NPU... So it is very application specific, and only in use when specifically called for, it seems? No chance of Windows choosing to use it for a certain task?

      Most of the test results do have the expected random variability, but the ones I mentioned were switching between two fairly consistent scores:

      [Attached image: image.png]

      PDF rendering and IOPS are pretty clear in when the change happened. Fonts and Text is odd because of that one higher score after reinstalling Windows. Windows interface is a bit strange too. It has a smaller decrease (but still consistent), and it happened after I replaced the latest Bluetooth/WiFi/NPU drivers from Samsung with those from Intel.


      • #4
        Microsoft are talking about using it with Copilot+ PCs. But it has been mostly talk so far, and your Intel CPU's NPU isn't included in this Copilot+ thing at the moment (and maybe never will be).
        So likely your NPU is never used, for anything.

        Looking at your table, I agree something has changed. Maybe some Windows patch level or video card drivers.
        There have been a bunch of security patches over the last few years that resulted in performance loss on 2D. It wasn't just 2D that suffered: any application that calls operating system kernel functions quickly, in a tight loop, was affected, because the patches made the transition from user space to kernel space slower. This doesn't explain the PDF rendering getting faster, however.

        In short: we don't know the reason.




        • #5
          I really should have noted down some of the original driver versions before reinstalling Windows. I did run Windows and Samsung update before the testing...

          The only thing I still want to try is graphics drivers. I am currently using the latest from Samsung, but there are two newer ones available from Intel - Workstation and Game On.

          The Samsung one claims some unspecified customizations for the laptop, but it's a year and a half old at this point and lacks many updates. I'll install the Workstation driver first and retest, then the Game On driver and test again, since it *seems* like the Game On contains most/all of what is included in the Workstation.

          Anything else you think might be worth changing and retesting for these three tests results? I'm not complaining about the much higher PDF score, but it would still be nice to understand the reason for the drastic change.


          • #6
            There have been enormous changes in the ARC graphics drivers over the last 18 months. The initial drivers for ARC were pretty bad for both performance and compatibility. But Intel has made a lot of improvements, so they aren't so bad now. I would be totally unsurprised to see large performance changes between different versions.



            • #7
              Alright, after much more testing, I have been unable to isolate any variable that determines which of the two results I get for the Fonts and Text test.

              I did notice something, though, that makes me think the issue might be with the test itself. On test runs where I get the higher score (585 or so), the test lasts for about 39 seconds. On runs where I get the lower score (315 or so), the test lasts for about 21 seconds. Assuming the test score is proportional to some number of operations or tasks completed during the test, this seems to be an issue with the test not terminating properly. The score/time ratio is exactly the same, at least within my accuracy with a stopwatch.
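
As a quick check of the arithmetic in this post, here is a minimal Python sketch that divides each score by its approximate duration. The numbers are the rounded figures quoted above (stopwatch timing), and the function and variable names are made up for illustration.

```python
# Approximate figures quoted in the post above (stopwatch timing, so rough).
runs = {
    "high": {"score": 585, "seconds": 39},
    "low": {"score": 315, "seconds": 21},
}


def score_per_second(score, seconds):
    """Return the score divided by the elapsed test time."""
    return score / seconds


rates = {name: score_per_second(r["score"], r["seconds"]) for name, r in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} points per second")

# Compare the ratio of the two scores with the ratio of the two durations.
score_ratio = runs["high"]["score"] / runs["low"]["score"]
time_ratio = runs["high"]["seconds"] / runs["low"]["seconds"]
print(f"score ratio {score_ratio:.3f}, time ratio {time_ratio:.3f}")
```

With these rounded figures both runs come out at 15 points per second, which is consistent with the idea that the score simply scales with how long the test ran.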

              Here is a screenshot of an example of each test result, with task manager showing the GPU utilization during the tests.

              [Attached image: graphics usage 300 normal.png]
              [Attached image: graphics usage 500 normal.png]


              • #8
                Can you post a debug log for the fast and slow cases?
                https://www.passmark.com/support/per.../debug-log.php
                (not sure if it will help, but can't hurt).


                • #9
                  I was lucky enough to get two instances of the higher score, so I made logs for each of them, and one for the lower score. I can only upload 3 files at once, so I will do the PerfTestLogs first and the SysInfoLogs in the next post.

                  • #10
                    And here are the SysInfoLogs. One thing I noticed relevant to the Fonts and Text test: the following lines were present in the PerfTestLogs for both high scores but missing from the log for the low score:

                    Code:
                    DEBUG PERF: DirectWriteTest::Initialize 1890 x 1050
                    DEBUG PERF: DirectWriteTest::CreateDeviceIndependentResources start
                    DEBUG PERF: DirectWriteTest::CreateDeviceIndependentResources DWriteCreateFactory
                    DEBUG PERF: CLSID_WICImagingFactory
                    DEBUG PERF: CreateTextFormat
                    DEBUG PERF: DirectWriteTest::CreateDeviceIndependentResources return 0
                    Debug: DirectWriteTest::RunTest start
                    Debug: DirectWriteTest::RunTest end

                    • #11
                      Oops, I should have looked closer. It's not that those lines don't appear in the low score PerfTestLog, but that the block I pasted above appears twice in succession in the high score logs.
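
For anyone wanting to repeat this check, a rough Python sketch that counts how often the run marker shows up in a log might look like the following. The marker string is copied from the excerpt quoted earlier in the thread; the encoding and the helper names are assumptions, not anything defined by PerformanceTest.

```python
from pathlib import Path

# Marker taken from the log excerpt quoted in the thread.
RUN_START = "DirectWriteTest::RunTest start"


def count_runs(log_text, marker=RUN_START):
    """Count how many log lines contain the marker."""
    return sum(1 for line in log_text.splitlines() if marker in line)


def count_runs_in_file(path, marker=RUN_START):
    """Read a log file and count marker lines (path and encoding are assumptions)."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return count_runs(text, marker)


# Small in-memory samples built from the quoted lines, not real log files.
single = "Debug: DirectWriteTest::RunTest start\nDebug: DirectWriteTest::RunTest end\n"
double = single * 2
print(count_runs(single), count_runs(double))
```

A count of 2 for a single Fonts & Text run would match the doubled block seen in the high score logs.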


                      • #12
                        We had a closer look at the Fonts & Text test. Seems there is a bug.
                        If both of these conditions are true, then the result can be double what it should be.
                        1) Machine has fast 2D (at least for rendering text)
                        2) Test duration is set to be reasonably long in the preferences window.

                        If these conditions are true, the test can run twice in the allowed test time, because the test was designed to loop until the test time duration is reached. But then a bug causes the results from both loops to be added instead of averaged.
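
To illustrate the kind of bug described here, the following Python sketch models a test that repeats whole passes until the time limit is reached and then combines the per-pass results. This is not the actual PerformanceTest source; the function names, the per-pass score of 300, and the timings are hypothetical values chosen only to show the effect of summing instead of averaging.

```python
def simulate_passes(pass_seconds, duration_seconds):
    """Count whole passes started before the time limit is reached."""
    elapsed = 0.0
    passes = 0
    while elapsed < duration_seconds:
        elapsed += pass_seconds  # each pass always runs to completion
        passes += 1
    return passes


def combine_buggy(scores):
    """Add the per-pass scores together (the behaviour described as the bug)."""
    return sum(scores)


def combine_averaged(scores):
    """Average the per-pass scores (the behaviour described as intended)."""
    return sum(scores) / len(scores)


PASS_SCORE = 300  # hypothetical score for one full pass
DURATION = 20.0   # hypothetical configured test duration in seconds

# A pass that finishes just inside the limit triggers a second pass.
for pass_seconds in (20.5, 19.5):
    scores = [PASS_SCORE] * simulate_passes(pass_seconds, DURATION)
    print(pass_seconds, len(scores), combine_buggy(scores), combine_averaged(scores))
```

In this model a small change in pass time around the limit flips the result between one and two passes, so the summed score doubles while the averaged score stays the same.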

                        We didn't notice this bug before, because when this test was last updated, the hardware wasn't fast enough to complete two loops even with a long test duration. Seems it is now however.

                        The test also comprises 6 sub tests (scrolling, different fonts & zoom with two APIs). The scaling is also a bit off between the sub-tests for modern hardware, so we'll need to eventually look at that as well.

                        For the moment we'll put out a patch release to always force a single loop of the test (in V11 build 1020). This will fix the issue.

                        For the next major release we'll do a bigger overhaul of the test, which will likely change the score on all systems slightly.

                        Workaround for today is to set the test duration to be short.



                        • #13
                          Interesting, thanks for the detailed follow up!
