PerformanceTest Baseline Uploading - New build required


  • PerformanceTest Baseline Uploading - New build required

We have just completed a migration to a new web server, but some of the old features could not be moved across (specifically, the FTP logins required for PerformanceTest baseline uploading).

New Windows x86/ARM and Linux x86/ARM builds are now available.

Versions older than V10.1.1006 (including all V9 builds) will no longer be able to upload baselines.
    Last edited by Tim (PassMark); May-18-2022, 10:55 PM.

  • #2
We have been able to have our server modified so that baseline uploads from older versions of PerformanceTest can continue. However, this is only a temporary measure, and we still recommend downloading the latest V10 build.



    • #3
Something's broken in the new PassMark 11.1. I've run hundreds of tests in version 10.1, and the results were always better. The 11.1 results come out too low and shouldn't be combined with the 10.x results database, because they're misleading and drag down the processor rankings.

Below is a comparison. Clearly, something has changed in Integer Math (always below 200,000 in 11.1 and always above 211,000 in 10.1), Prime Numbers, and Sorting. Either the algorithms have changed, or something in the new version is interfering with these calculations.

In the RAM test, something has also changed in Database Operations, as the results have now increased by several percent.

      11.1

[Attachment: image.png, ID 59615]

      10.1

[Attachment: Zrzut ekranu 2025-08-12 083220.png, ID 59617]



      • #4
There were big algorithm changes between V9 and V10.
The changes between V10 and V11 in the CPU test suite were smaller, and the numbers are largely comparable. Your overall difference was 1.8%, and I suspect that if you ran both versions 10 times and took an average, the difference might be even smaller.

Nonetheless, best practice is to compare benchmarks from exactly the same version.

I did some testing today on a Ryzen 5 5600X with V10 and V11. Here are the results (6 runs on each version, then the average of each). The difference was 0.99%.
On average, the difference between individual runs was larger than the difference between the versions.

But the variation between runs was higher in V10. This might just have been down to chance, however. Modern Windows has so much background activity and power/thermal management activity that it is hard to get a clean, consistent run.

[Attachment: image.png, ID 59624]
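For anyone who wants to sanity-check their own numbers the same way, here is a minimal sketch of the averaging approach. The run scores in it are made-up placeholders, not the results above:

```python
# Minimal sketch of the run-averaging approach described above.
# The scores are made-up placeholders, not real PerformanceTest results.
from statistics import mean, stdev

v10_runs = [71900, 71200, 72100, 70800, 71500, 71300]  # hypothetical CPU Mark scores
v11_runs = [71000, 71400, 70900, 71200, 71100, 70800]

for label, runs in (("V10", v10_runs), ("V11", v11_runs)):
    m, s = mean(runs), stdev(runs)
    print(f"{label}: mean={m:.0f}, run-to-run spread={100 * s / m:.2f}% of mean")

# Version-to-version difference of the averaged scores.
diff_pct = 100 * abs(mean(v10_runs) - mean(v11_runs)) / mean(v10_runs)
print(f"V10 vs V11 (difference of averages): {diff_pct:.2f}%")
```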



        • #5
Originally posted by David (PassMark)
There were big algorithm changes between V9 and V10.
The changes between V10 and V11 in the CPU test suite were smaller, and the numbers are largely comparable. Your overall difference was 1.8%, and I suspect that if you ran both versions 10 times and took an average, the difference might be even smaller.

Nonetheless, best practice is to compare benchmarks from exactly the same version.

I did some testing today on a Ryzen 5 5600X with V10 and V11. Here are the results (6 runs on each version, then the average of each). The difference was 0.99%.
On average, the difference between individual runs was larger than the difference between the versions.

But the variation between runs was higher in V10. This might just have been down to chance, however. Modern Windows has so much background activity and power/thermal management activity that it is hard to get a clean, consistent run.

[Attachment: image.png, ID 59624]
Thanks!

You checked the performance of the entire computer, but when you overclock the CPU and RAM or optimize their performance (as I like to do, without increasing voltages and with factory PPT settings) and then compare the CPU and individual subtest results against other 5950X/7950X/9950X owners in your database, the difference matters.

Just compare the results of the prime number calculations (325 vs. 240!), Integer Math, and Sorting... Something went wrong in version 11, and performance dropped significantly. This is not a coincidence or the influence of background processes: I run benchmarks with unnecessary processes turned off, and I gave version 11 several chances with always the same result. (I also checked version 11 over a year ago, and it was similar.)

On the 5950X, I had results in the TOP 20 worldwide, and 2nd place in the world in V-Ray Benchmark 6. I like to use PerformanceTest alternately with Geekbench 5 (6 is a scam...) because of the variety of calculations. PerformanceTest is the only one that lets you conveniently compare results for the same hardware with results from around the world.





          • #6
Even in my runs there was a 72,394 result from V10 that was slightly higher than all the V11 runs, but only by 2%. So I think you are probably partially right: a small performance difference is possible in the integer maths test. But I wouldn't classify this as "broken". The difference could be explained by us moving to a new compiler version that does better code optimization (this is just speculation, as we haven't really investigated this small difference).

V10 is pretty much dead now anyway; we are moving on to V12 shortly.



            • #7
Originally posted by David (PassMark)
Even in my runs there was a 72,394 result from V10 that was slightly higher than all the V11 runs, but only by 2%. So I think you are probably partially right: a small performance difference is possible in the integer maths test. But I wouldn't classify this as "broken". The difference could be explained by us moving to a new compiler version that does better code optimization (this is just speculation, as we haven't really investigated this small difference).

V10 is pretty much dead now anyway; we are moving on to V12 shortly.

The overall result doesn't change much because the individual subtests have different weights and impacts (see the sketch below). But I reran the tests in version 10.2 when I had some free time, and that's where the real problem occurred, and it has been carried over to version 11.x.
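Just to illustrate the weighting point, here is a rough sketch. This is not PassMark's actual formula; the weights, reference scores, and subtest values are invented purely to show the arithmetic of how a large drop in one subtest can move the overall score only slightly:

```python
# Rough illustration only - NOT PassMark's actual CPU Mark formula.
# Weights, reference scores, and subtest values are all invented.
from math import prod

WEIGHTS = {"Integer Math": 1.0, "Floating Point": 1.0, "Prime Numbers": 0.5,
           "Sorting": 0.5, "Compression": 1.0, "Physics": 1.0}
REFERENCE = {"Integer Math": 100000, "Floating Point": 90000, "Prime Numbers": 100,
             "Sorting": 30000, "Compression": 250000, "Physics": 2000}

def overall(scores):
    """Weighted geometric mean of subtest scores normalized to a reference."""
    total_w = sum(WEIGHTS.values())
    return prod((scores[k] / REFERENCE[k]) ** (WEIGHTS[k] / total_w) for k in WEIGHTS)

before = {"Integer Math": 211000, "Floating Point": 95000, "Prime Numbers": 325,
          "Sorting": 35000, "Compression": 260000, "Physics": 2100}
after = dict(before, **{"Integer Math": 198000, "Prime Numbers": 240})  # the two drops

r1, r2 = overall(before), overall(after)
print(f"Prime Numbers change: {100 * (240 - 325) / 325:.0f}%")   # -26% in one subtest
print(f"Overall score change: {100 * (r2 - r1) / r1:+.1f}%")     # much smaller overall
```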

              In 10.2, it's probably not the newer compiler version that's to blame, and it's not the same problem as in 11.x.

I did a comparison; look at the huge difference in Prime Numbers! That's a 35% difference. So even if it were a newer compiler, it would be a poor one, calculating prime numbers much slower. Similarly, Integer Math is always lower in 10.2 and 11.x. I've run hundreds of tests on 10.1 (while optimizing CPUs) and several dozen on 10.2 and 11.x, and the result is repeatable. Sorting: something happened in 10.2, and the results were over 80,000, beyond the end of the chart's scale (and where are the results from the 9950X, the Threadrippers, etc.? This is the "My CPU vs. the World" graph; they should be there). In 11.x, the Sorting results are back to normal and practically identical to 10.1, so it's probably not the compiler's fault; a different algorithm was probably used in 10.2 for these calculations.

              Check this:


[Attachment: image.png, ID 59933]
[Attachment: image.png, ID 59934]


Another strange thing is the graphics card 3D tests... The 11.x results are always almost 3x lower in DirectX 11 and DirectX 12 than in 10.1/10.2!!
[Attachment: image.png, ID 59935]
Look at the impact this has on the graphics card's percentile ranking: 73% vs. 87%.

              So... What compiler did you use?




I hope you'll fix this in version 12.

And I hope that whatever was suppressing/slowing down some calculations in 10.2 and 11.x will be found in the code and removed. Then we'll be able to easily test newer processors in 12 and compare results with other CPUs and their owners.



              • #8
As noted above, if you are doing comparisons it is best to use the same software version as much as possible.
You should also do a number of runs and average the results (or take the max); the variance of a single run can be misleading if you are looking at small differences.

For the 3D tests, we deliberately made them more complex in V11 compared to V10. This was done to better load more modern video cards. There was never any intent to make the individual 3D test results comparable between releases. The 3D tests will get even more complex in V12.
                You can find a list of the changes in V11 here.

                For the CPU tests we did intend to make them fairly comparable between V10 and V11. They are changing in V12 however (in part to deal with increasing cache sizes and new instruction set extensions).

I did some testing here and found:
1) The string sorting test is faster in V10.2.1017 compared to prior V10.x releases and V11.x. In my testing, the difference was around 8%.
2) The prime number test is faster in both V10.2.1017 and V11.x compared to V10.1.x. The difference was around 13%.

In V11 we switched to Visual Studio 2022 as the compiler, while V10 was built using VS2019. There were also some bug fixes and a bunch of minor changes, so some minor performance differences are to be expected.

However, and this is the interesting bit, it appears that V10.2.1017 (the final V10 release) was also built with VS2022, whereas V10.2.1016 was built with VS2019. This looks to be a mistake on our side, as we try to avoid compiler changes when doing patch releases. V11 was also built with VS2022, but with a slightly newer patch release than the one used for V10.2.1017.

There were no source code changes to these tests on x86. There were some changes for the ARM release, as that was still new at the time.

If you run dumpbin on the PerformanceTest .exe file headers for each of the versions, you get this.

V11.1.1007
time date stamp Wed Sep 24 08:05:45 2025
14.36 linker version

V10.2.1017
time date stamp Mon Feb 20 15:51:32 2023
14.31 linker version

V10.2.1016
time date stamp Tue Jan 31 22:06:48 2023
14.24 linker version

The linker version corresponds to the version of Visual Studio used, so three slightly different compiler versions were involved, which I think explains what you have seen.
Obviously this isn't perfect, but there is no real reason to use V10 anymore, and we don't think it makes sense to go back and fix it up when V12 is close.
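If anyone wants to check their own copy, here is a quick sketch that pulls those two fields out of the dumpbin output (dumpbin ships with Visual Studio; the install path below is just an example):

```python
# Sketch: extract the "time date stamp" and "linker version" fields shown
# above from "dumpbin /headers". Requires dumpbin.exe on the PATH; the
# .exe path used below is just an example.
import re
import subprocess

def pe_build_info(exe_path):
    out = subprocess.run(["dumpbin", "/headers", exe_path],
                         capture_output=True, text=True, check=True).stdout
    stamp = re.search(r"time date stamp\s+(.+)", out)
    linker = re.search(r"(\d+\.\d+) linker version", out)
    return (stamp.group(1).strip() if stamp else None,
            linker.group(1) if linker else None)

stamp, linker = pe_build_info(r"C:\Program Files\PerformanceTest\PerformanceTest64.exe")
print(f"time date stamp: {stamp}")
print(f"linker version:  {linker}")
```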

TL;DR: Different compiler versions can give different performance. It looks like we accidentally used VS2022 for the final V10 release, probably because we were already deep into the V11 development process, which was using VS2022.

