Single Thread Score rating

dylandog replied

Mar-21-2020, 09:13 AM
Originally posted by David (PassMark)

I don't think there is any evidence that the Ryzen 3000 series has far superior IPC than the 9900K. It might have in some particular scenarios, but not across the board.

In V9 we ranked Ryzen 9 3900X and 9900K as basically the same (under 2% difference).
In V10 we (at the moment) are saying there is a 9% difference. The result will likely move around a bit more over the coming weeks.

The 9900K is clocked at 5 GHz, which is 400 MHz faster than the 3900X (9% difference).

In short: The algorithms changed (see previous posts). And as pointed out above, even tiny code changes can move results around a lot.

David, there are many tests and benchmarks that show Ryzen 3000 with higher IPC and multicore performance than Coffee Lake (like here https://www.youtube.com/watch?v=DjBC_SzEKh4), and it seems that you don't want to accept it. The Ryzen 3950X beats the 9900KS in most single thread applications, but this new update has totally broken the Ryzen 3000 results, while in gaming Coffee Lake is faster due to lower latency.
BenchmarkManiac replied

Mar-20-2020, 07:18 PM
Originally posted by David (PassMark)

The result will likely move around a bit more over the coming weeks.

So far it is only getting worse. The Ryzen 3950X is already losing even to the 3600X. This is nonsense. It really looks like the test is run with no boost, as if some heavy task is running in the background, or the single thread test is run just after a heavy multithreaded load and is very short. CPUs are sorted by their base frequency.

Kaby Lake still outperforms all the recent AMD CPUs, which is also very strange.
CerianK replied

Mar-20-2020, 03:30 PM
Looking at recent benchmarks, many AMD builds cap the single-thread performance by locking the CPU to a speed well under its maximum rating.
This produces a lower single thread performance and increases the deviation in all results.
I just ran my son's new 3800X build and the single core result was about 2790, well above average since it was able to stretch up to 4.5GHz.
The new V10 results parallel my research into random number generation across multiple platforms. Kudos.
I had not looked at Rand() to notice any change and/or bias.
The V9 results were obviously skewed for comparison in that scenario (asm primitives), and it is unfortunate that the skew flipped in the other direction when V10 was first released.
David (PassMark) replied

Mar-20-2020, 03:01 AM
I don't think there is any evidence that the Ryzen 3000 series has far superior IPC than the 9900K. It might have in some particular scenarios, but not across the board.

In V9 we ranked Ryzen 9 3900X and 9900K as basically the same (under 2% difference).
In V10 we (at the moment) are saying there is a 9% difference. The result will likely move around a bit more over the coming weeks.

The 9900K is clocked at 5 GHz, which is 400 MHz faster than the 3900X (9% difference).

In short: The algorithms changed (see previous posts). And as pointed out above, even tiny code changes can move results around a lot.
claurie replied

Mar-19-2020, 03:25 PM
I don't get how Zen 2 on Ryzen 3000 (3900X and others), with IPC so far superior that it equals the i9-9900K in most single thread tests (including the one in PassMark 9, Cinebench R15, Cinebench R20, etc.), is suddenly (or magically) dropping in the new PassMark 10 build 1004 single thread test. The Ryzen 9 3900X even beats the new i9-10900K in gaming ( https://wccftech.com/intel-core-i9-1...nchmarks-leak/ ).
David (PassMark) replied

Mar-18-2020, 03:21 AM
People underestimate the amount of performance variation between machines in the wild. Even when using the same CPU. There are dozens of small and large effects that impact the measured CPU performance. It is doubly bad for laptops as vendors often ship them with single channel RAM and bad cooling.

Here are some examples for PerformanceTest V9.

Ryzen 5 Desktop CPU has a pretty nice looking bell curve (as would be expected for real life sampling of any variable). But the max and min extremes are still pretty far apart.

Here's a distribution graph for a laptop CPU. I am guessing that there were just a few popular vendors using this 7700HQ CPU (e.g. Dell, HP, etc..) and the two major laptop models perform differently from each other due to design decisions like RAM and cooling. Thus giving two peaks on the graph. Making the spread of results even wider than for desktop CPUs.
BenchmarkManiac replied

Mar-18-2020, 12:31 AM
Well, it is still odd that the 3800X outperforms the 3950X, but the margin is very tiny and maybe they'll swap when more samples are collected.
Seeing the 8th generation Coffee Lake processors among the top 5 is also odd, but maybe.
I don't believe that Kaby Lake's 7700K outperforms the latest AMD; that is wrong for sure.

Last edited by BenchmarkManiac; Mar-18-2020, 12:52 AM.
David (PassMark) replied

Mar-17-2020, 10:45 PM
Yes, should happen tomorrow.
(they would be diluted shortly in any case).

Calculations this morning (off a fairly small number of build 1004 samples) indicate results will look more like this tomorrow. Ignore the ridiculous number of decimal places; the numbers are nothing like that accurate.
BenchmarkManiac replied

Mar-17-2020, 12:57 PM
The inconsistent results of "PerformanceTest 10.1003 with Rand()" will be completely removed from the standings calculation formula shortly, right?
David (PassMark) replied

Mar-16-2020, 11:25 PM
An attempt is being made to follow up with Microsoft to see if we can get them to tell us what they did to Rand() to mess it up. My guess is that internally Rand() calls some Windows API, and a lot of the API calls had their performance affected by the Spectre, Meltdown, etc. security patches.

We had a quick look at the Rand() source code today (the part of it that was public). The core of it is pretty simple (just 2 lines of code in fact). So as BenchmarkManiac correctly pointed out, obviously that can't be different from one Windows version to the next. But there is a bunch of extra code managing per thread CRT status structures (100s of lines of code) and it isn't all public. There was far more code for the memory management of the thread state than for actually generating random numbers. So there is maybe something in that code.

The new minstd_rand() function isn't a crypto type random. i.e. it isn't truly random. It is still pseudo random. So no special hardware acceleration & no context switches.

If the documentation is to be believed the new function is basically 1 line of code. Being,

Code:

x = x * 48271 % 2147483647
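
As a rough check of the "1 line of code" description, the sketch below compares that recurrence against `std::minstd_rand` from the C++ standard library, which is specified as a linear congruential engine with multiplier 48271, increment 0 and modulus 2147483647. The helper names `minstd_step` and `matches_std_minstd` are made up for this illustration and are not taken from PerformanceTest.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// One step of the recurrence quoted above: x = x * 48271 % 2147483647.
// The product is computed in 64 bits so it cannot overflow.
std::uint32_t minstd_step(std::uint32_t x) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) * 48271u) % 2147483647u);
}

// Returns true if the hand-written recurrence and std::minstd_rand
// produce the same first `count` values when started from `seed`.
bool matches_std_minstd(std::uint32_t seed, int count) {
    // The engine remaps a zero state to 1, so this sketch excludes that case.
    assert(seed % 2147483647u != 0);
    std::minstd_rand engine(seed);
    std::uint32_t x = seed;
    for (int i = 0; i < count; ++i) {
        x = minstd_step(x);
        if (engine() != x) return false;
    }
    return true;
}
```

Calling `matches_std_minstd(1, 10000)` walks both generators in lockstep. Each step is one multiply and one modulo on local state, with no per-thread CRT bookkeeping involved.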
BenchmarkManiac replied

Mar-16-2020, 08:19 PM
You've given a link to the standard C++ library rand() function. It is statically linked into the code and shouldn't differ between Windows versions. If you are using some kind of crypto API Rand() implementation, then its performance can vary greatly from one platform to another; it can cause context switches, etc., and absolutely shouldn't be part of a timed portion of the benchmark.
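
A minimal sketch of that last point, keeping input generation outside the timed region. This is not PerformanceTest's code: `count_runs` is a hypothetical stand-in for the real workload, and `TimedResult` and `time_workload_only` are names invented for the example.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the real workload: counts runs of identical bytes.
std::size_t count_runs(const std::vector<std::uint8_t>& data) {
    if (data.empty()) return 0;
    std::size_t runs = 1;
    for (std::size_t i = 1; i < data.size(); ++i) {
        if (data[i] != data[i - 1]) ++runs;
    }
    return runs;
}

struct TimedResult {
    std::size_t runs;                              // workload output
    std::chrono::steady_clock::duration elapsed;   // time spent in the workload only
};

TimedResult time_workload_only(std::size_t size, std::uint32_t seed) {
    // Setup phase: generate the input before the clock starts.
    std::minstd_rand rng(seed);
    std::vector<std::uint8_t> data(size);
    for (auto& b : data) {
        b = static_cast<std::uint8_t>(rng() % 27 + 96);
    }

    // Timed phase: only the workload sits between the two clock reads.
    const auto start = std::chrono::steady_clock::now();
    const std::size_t runs = count_runs(data);
    const auto stop = std::chrono::steady_clock::now();
    return {runs, stop - start};
}
```

With this layout, a slow or platform-dependent random number generator changes how long setup takes but not the reported time.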
David (PassMark) replied

Mar-16-2020, 10:43 AM
Between rolling out the new graphs, PT10 release and dealing with impact of coronavirus today, I didn't get time to read all the posts above.

Hopefully they are all polite & factual. Might get to them tomorrow. Software patch probably addresses some of it anyway.

David (PassMark) replied

Mar-16-2020, 10:30 AM

A really interesting update:

As background: The single threaded test is an aggregate of the floating point, string sorting and data compression tests (each of them is run in series on 1 core). The compression test uses Crypto++ Gzip (based on the DEFLATE compression algorithm). This test uses memory buffers totaling about 4MB per core.

AMD were kind enough to take a look at the single thread results, pull apart the code from PerformanceTest v10 to see what was being executed, and contact us about it.

From the 3 sub-tests, the data compression test was pulling the AMD 3000 series down the most (relative to other CPUs).

Deeper analysis on the data compression test showed that it wasn't doing as much compression as expected; it was spending an unexpectedly large portion of its time generating random data to be compressed. Generating random numbers was always part of the test, but it should have been a small part.

So this is the interesting part. We compared different Windows releases for the CPU Compression test. There was a 15% drop in the compression benchmark between Win10 Build 10240 & Win10 Build 18362 (we don't know exactly which patch caused the problem, but speculation is that it was one of the many security fixes). So it seems clear that patches on Windows significantly slowed down the test and the function that became slower was Rand(). But the aim of this test was not to measure the performance of Rand(), nor to measure the impact of that security patch. So we decided to change it. We can't have a situation where different Win10 versions so significantly impact the CPU score (that might be fine for the 2D score, but not the CPU score).

We swapped Rand() for minstd_rand, which is a different random number generator algorithm. Basically a two line change in the code without altering the functionality.

Code:

+    std::minstd_rand rng(RAND_SEED);
-        pbDataBuffer[i] = (rand() % 27) + 96;
+        pbDataBuffer[i] = (rng() % 27) + 96;
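
For readers who want to try the patched fill loop on its own, here is a self-contained sketch. The value of `RAND_SEED` is not given in the post, so the one below is a placeholder. `make_test_buffer` and `all_in_range` are names invented for this example, and a `std::vector` stands in for `pbDataBuffer`.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Placeholder seed; the real RAND_SEED value is not shown in the post.
constexpr std::uint32_t RAND_SEED = 12345u;

// Fills a buffer the same way as the patched loop in the diff.
std::vector<std::uint8_t> make_test_buffer(std::size_t size) {
    std::minstd_rand rng(RAND_SEED);
    std::vector<std::uint8_t> buffer(size);
    for (std::size_t i = 0; i < size; ++i) {
        // rng() % 27 is 0..26, so each byte is 96..122 ('`' through 'z').
        buffer[i] = static_cast<std::uint8_t>((rng() % 27) + 96);
    }
    return buffer;
}

// Checks that every byte falls in the 96..122 range produced above.
bool all_in_range(const std::vector<std::uint8_t>& buffer) {
    for (std::uint8_t b : buffer) {
        if (b < 96 || b > 122) return false;
    }
    return true;
}
```

Because the engine is seeded with a fixed value, two calls with the same size return identical buffers, which keeps the input to the compression step repeatable between runs.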

This was the impact on the single threaded test.

CPU Model	PerformanceTest 10.1003 with Rand()	PerformanceTest 10.1004 with minstd_rand()	Increase in benchmark result
i7-8700k	2,778	3,003	8.1%
i3-4160	1,871	2,075	10.9%
FX-8120	1,405	1,652	17.6%
Ryzen 9 3900	2,529	3,022	19.5%
Ryzen TR 3970X	2,500	2,997	19.9%

So all CPUs benefited. But AMD 3000 series got the most benefit.
After scaling all the CPU results back down to PT9 levels, the net benefit to the 3000 series should be around 9%. Which should help close the expectation gap people are referring to above. We won't know exactly until we get a few hundred new results come in.

But to be clear, the old code was totally valid code & rand() has been the go-to function for random number generation for 30+ years. Rand was used everywhere historically. Microsoft have now given it inconsistent performance however.
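
The "Increase in benchmark result" column in the table above can be reproduced from the two score columns. A small sketch follows. `percent_increase` is a helper name chosen for this example, and the calculation covers only the raw build 1003 to build 1004 change, not the later scaling back to PT9 levels.

```cpp
#include <cmath>

// Percentage change from the build 1003 score to the build 1004 score.
double percent_increase(double before, double after) {
    return (after - before) / before * 100.0;
}

// Rounds to one decimal place, matching the precision used in the table.
double round_to_tenth(double value) {
    return std::round(value * 10.0) / 10.0;
}
```

For example, `percent_increase(2529, 3022)` for the Ryzen 9 3900 row comes out at roughly 19.5, and `percent_increase(2778, 3003)` for the i7-8700k row at roughly 8.1, in line with the table.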

It's a good illustration of just how fickle CPUs are to different code. And to some degree what folly it is to rely on a single benchmark. People need to look at a range of results. It isn't reasonable to expect all benchmarks (or real life apps) to get consistent results across a range of CPUs. Despite the fact that this change should bring us slightly closer to community expectation, we are still of the opinion that diversity of results highlighting different aspects of a CPU is a good thing. (It really isn't a good thing if all benchmarks match Cinebench and POV).

This change has been rolled out in PerformanceTest V10 build 1004.
