Single Thread Score rating

  • dylandog
    replied
    Originally posted by David (PassMark)
    I don't think there is any evidence that the Ryzen 3000 series has far superior IPC to the 9900K. It might have in some particular scenarios, but not across the board.

    In V9 we ranked Ryzen 9 3900X and 9900K as basically the same (under 2% difference).
    In V10 we (at the moment) are saying there is a 9% difference. The result will likely move around a bit more over the coming weeks.

    The 9900K is clocked at 5GHz, which is 400MHz faster than 3900X. (9% difference).

    In short: The algorithms changed (see previous posts). And as pointed out above, even tiny code changes can move results around a lot.
    David, there are many tests and benchmarks that show Ryzen 3000 with higher IPC and multicore performance than Coffee Lake (like here https://www.youtube.com/watch?v=DjBC_SzEKh4), and it seems that you don't want to accept it. The Ryzen 3950X beats the 9900KS in most single-thread applications, but this new update has totally broken the Ryzen 3000 results, while in gaming Coffee Lake is faster due to lower latency.



  • BenchmarkManiac
    replied
    Originally posted by David (PassMark)
    The result will likely move around a bit more over the coming weeks.
    So far it is only getting worse. Ryzen 3950X is already losing even to 3600X. This is nonsense. It really looks like the test is run with no boost, as if some heavy task were running in the background, or the single thread test were run just after the heavy multithreaded load and was very short. CPUs end up sorted by their base frequency.

    Kaby Lake still outperforms all the recent AMDs, which is also very strange.




  • CerianK
    replied
    Looking at recent benchmarks, many AMD builds cap the single-thread performance by locking the CPU to a speed well under its maximum rating.
    This produces a lower single thread performance and increases the deviation in all results.
    I just ran my son's new 3800X build and the single core result was about 2790, well above average since it was able to stretch up to 4.5GHz.
    The new V10 results parallel my research into random number generation across multiple platforms. Kudos.
    I had not looked at Rand() to notice any change and/or bias.
    The V9 results were obviously skewed for comparison in that scenario (asm primitives), and it is unfortunate that the skew flipped in the other direction when V10 was first released.



  • David (PassMark)
    replied
    I don't think there is any evidence that the Ryzen 3000 series has far superior IPC to the 9900K. It might have in some particular scenarios, but not across the board.

    In V9 we ranked Ryzen 9 3900X and 9900K as basically the same (under 2% difference).
    In V10 we (at the moment) are saying there is a 9% difference. The result will likely move around a bit more over the coming weeks.

    The 9900K is clocked at 5GHz, which is 400MHz faster than 3900X. (9% difference).

    In short: The algorithms changed (see previous posts). And as pointed out above, even tiny code changes can move results around a lot.



  • claurie
    replied
    I don't get how Zen 2 on Ryzen 3000 (3900X and others), with IPC so far superior that it equals the i9-9900K in most single-thread tests (including the one in Passmark 9, Cinebench R15, Cinebench R20, etc.), is suddenly (or magically) dropping in the new Passmark 10 build 1004 single-thread test. The Ryzen 9 3900X even beats the new i9-10900K in gaming ( https://wccftech.com/intel-core-i9-1...nchmarks-leak/ ).



  • David (PassMark)
    replied
    People underestimate the amount of performance variation between machines in the wild. Even when using the same CPU. There are dozens of small and large effects that impact the measured CPU performance. It is doubly bad for laptops as vendors often ship them with single channel RAM and bad cooling.

    Here are some examples for PerformanceTest V9.

    Ryzen 5 Desktop CPU has a pretty nice looking bell curve (as would be expected for real life sampling of any variable). But the max and min extremes are still pretty far apart.

    [Image: Distribution-Ryzen-3600.png]

    Here's a distribution graph for a laptop CPU. I am guessing that there were just a few popular vendors using this 7700HQ CPU (e.g. Dell, HP, etc.) and the two major laptop models perform differently from each other due to design decisions like RAM and cooling. This gives two peaks on the graph and makes the spread of results even wider than for desktop CPUs.

    [Image: Distribution-7700HQ.png]



  • BenchmarkManiac
    replied
    Well, it is still odd that the 3800X outperforms the 3950X, but the margin is very tiny and maybe they'll swap when more samples are collected.
    Seeing the 8th generation Coffee Lake processors among the top 5 is also odd, but maybe.
    I don't believe that Kaby Lake's 7700K outperforms the latest AMD; that is wrong for sure.
    Last edited by BenchmarkManiac; Mar-18-2020, 12:52 AM.



  • David (PassMark)
    replied

    Yes, should happen tomorrow.
    (they would be diluted shortly in any case).

    Calculations this morning (off a fairly small number of build 1004 samples) indicate results will look more like this tomorrow. Ignore the ridiculous number of decimal places; the numbers are nothing like that accurate.


    [Image: SingleThread-Build 1004.png]




  • BenchmarkManiac
    replied
    The inconsistent results of "PerformanceTest 10.1003 with Rand()" will be completely removed from the standings calculation formula shortly, right?



  • David (PassMark)
    replied
    An attempt is being made to follow up with Microsoft to see if we can get them to tell us what they did to Rand() to mess it up. My guess is that internally Rand() calls some Windows API and a lot of the API calls had their performance affected by the Spectre, Meltdown, etc. security patches.

    We had a quick look at the Rand() source code today (the part of it that was public). The core of it is pretty simple (just 2 lines of code in fact). So, as BenchmarkManiac correctly pointed out, that obviously can't be different from one Windows version to the next. But there is a bunch of extra code managing per-thread CRT status structures (100s of lines of code) and it isn't all public. There was far more code for the memory management of the thread state than for actually generating random numbers. So there is maybe something in that code.

    The new minstd_rand() function isn't a crypto type random. i.e. it isn't truly random. It is still pseudo random. So no special hardware acceleration & no context switches.

    If the documentation is to be believed the new function is basically 1 line of code. Being,
    Code:
    x = x * 48271 % 2147483647
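
    As a rough illustration of that one-liner (a sketch, not PassMark's code), the recurrence can be written by hand and checked against the standard library's std::minstd_rand. The helper names lehmer_next and matches_minstd_rand are made up for this example, and the comparison assumes the seed is in the range 1 to 2147483646.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hand-written step of the recurrence quoted above: x = x * 48271 % 2147483647.
// A 64-bit intermediate is used so that x * 48271 cannot overflow.
std::uint32_t lehmer_next(std::uint32_t x) {
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) * 48271u) % 2147483647u);
}

// Returns true if the first n outputs of std::minstd_rand equal the
// hand-written recurrence started from the same seed.
// Assumption: 0 < seed < 2147483647 (other seeds are remapped by the library).
bool matches_minstd_rand(std::uint32_t seed, int n) {
    assert(seed > 0u && seed < 2147483647u);
    std::minstd_rand rng(seed);
    std::uint32_t x = seed;
    for (int i = 0; i < n; ++i) {
        x = lehmer_next(x);
        if (rng() != x) {
            return false;
        }
    }
    return true;
}
```

    There is no per-thread state, locking, or OS call in this path, which is consistent with the point above that the new generator needs no special hardware and causes no context switches.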



  • BenchmarkManiac
    replied
    You've given a link to the standard C++ library rand() function. It is statically linked into the code and shouldn't differ among Windows versions. If you are using some kind of crypto API Rand() implementation, then its performance can be very different from one platform to another, it can cause context switches etc., and it absolutely shouldn't be part of a timed portion of the benchmark.



  • David (PassMark)
    replied
    Between rolling out the new graphs, PT10 release and dealing with impact of coronavirus today, I didn't get time to read all the posts above.

    Hopefully they are all polite & factual. Might get to them tomorrow. Software patch probably addresses some of it anyway.



  • David (PassMark)
    replied
    A really interesting update:

    As background: The single threaded test is an aggregate of the floating point, string sorting and data compression tests (each of which is run in series on 1 core). The compression test uses Crypto++ Gzip (based on the DEFLATE compression algorithm). This test uses memory buffers totaling about 4MB per core.

    AMD were kind enough to take a look at the single thread results, pull apart the code from PerformanceTest v10 to see what was being executed, and contact us about it.

    From the 3 sub-tests, the data compression test was pulling the AMD 3000 series down the most (relative to other CPUs).

    Deeper analysis on the data compression test showed that it wasn't doing as much compression as expected; it was spending an unexpectedly large portion of its time generating random data to be compressed. Generating random numbers was always part of the test, but it should have been a small part.
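
    A minimal sketch of how such a split could be measured, assuming a simple two-phase test. This is not the PerformanceTest code: the names PhaseTimes, time_phases and generate_fraction are invented for illustration, and an Adler-style checksum stands in for the real Gzip compression step.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct PhaseTimes {
    double generate_s;       // seconds spent filling the buffer
    double process_s;        // seconds spent in the stand-in workload
    std::uint32_t checksum;  // result of the workload, so it cannot be optimized away
};

// Times buffer generation and the workload separately.
PhaseTimes time_phases(std::size_t n) {
    assert(n > 0);
    using clock = std::chrono::steady_clock;
    std::vector<unsigned char> buf(n);

    auto t0 = clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<unsigned char>((std::rand() % 27) + 96);
    }
    auto t1 = clock::now();

    // Stand-in workload (NOT real compression): Adler-style running checksum.
    std::uint32_t a = 0, b = 0;
    for (unsigned char c : buf) {
        a = (a + c) % 65521u;
        b = (b + a) % 65521u;
    }
    auto t2 = clock::now();

    return {std::chrono::duration<double>(t1 - t0).count(),
            std::chrono::duration<double>(t2 - t1).count(),
            (b << 16) | a};
}

// Fraction of total time spent generating data (0.0 if nothing was measured).
double generate_fraction(const PhaseTimes& t) {
    double total = t.generate_s + t.process_s;
    return total > 0.0 ? t.generate_s / total : 0.0;
}
```

    If generate_fraction comes out large, the benchmark is mostly measuring the random number generator rather than the workload it was meant to measure.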

    So this is the interesting part. We compared different Windows releases for the CPU Compression test. There was a 15% drop in the compression benchmark between Win10 Build 10240 & Win10 Build 18362 (we don't know exactly which patch caused the problem, but speculation is that it was one of the many security fixes). So it seems clear that patches on Windows significantly slowed down the test and the function that became slower was Rand(). But the aim of this test was not to measure the performance of Rand(), nor to measure the impact of that security patch. So we decided to change it. We can't have a situation where different Win10 versions so significantly impact the CPU score (that might be fine for the 2D score, but not the CPU score).

    We swapped Rand() for minstd_rand, which is a different random number generator algorithm. It was basically a two-line change in the code, without altering the functionality.

    Code:
    +    std::minstd_rand rng(RAND_SEED);
    -        pbDataBuffer[i] = (rand() % 27) + 96;
    +        pbDataBuffer[i] = (rng() % 27) + 96;
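
    For context, the changed lines could sit in a fill routine along these lines. This is a hedged sketch only: the function name make_test_buffer and the RAND_SEED value are assumptions for the example, since the post shows neither the surrounding loop nor the real seed.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical seed value; the real RAND_SEED is not given in the post.
constexpr unsigned RAND_SEED = 12345u;

// Fills a buffer of n bytes using std::minstd_rand, mirroring the
// (rng() % 27) + 96 expression from the diff above.
std::vector<unsigned char> make_test_buffer(std::size_t n,
                                            unsigned seed = RAND_SEED) {
    std::minstd_rand rng(seed);
    std::vector<unsigned char> buf(n);
    for (std::size_t i = 0; i < n; ++i) {
        // rng() % 27 is 0..26, so each byte lands in 96..122 ('`' through 'z').
        buf[i] = static_cast<unsigned char>((rng() % 27) + 96);
        assert(buf[i] >= 96 && buf[i] <= 122);
    }
    return buf;
}
```

    Because the generator is a local object with a fixed seed, the same bytes are produced on every run and every platform, so the data fed to the compressor no longer depends on the C runtime's rand() implementation.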
    The change had the following impact on the single threaded test:

    CPU Model         PerformanceTest 10.1003   PerformanceTest 10.1004   Increase in
                      with Rand()               with minstd_rand()        benchmark result
    i7-8700k          2,778                     3,003                      8.1%
    i3-4160           1,871                     2,075                     10.9%
    FX-8120           1,405                     1,652                     17.6%
    Ryzen 9 3900      2,529                     3,022                     19.5%
    Ryzen TR 3970X    2,500                     2,997                     19.9%

    So all CPUs benefited. But AMD 3000 series got the most benefit.
    After scaling all the CPU results back down to PT9 levels, the net benefit to the 3000 series should be around 9%. Which should help close the expectation gap people are referring to above. We won't know exactly until we get a few hundred new results come in.
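
    The "increase" figures quoted for build 1003 versus build 1004 are plain percentage gains. A small sketch of that arithmetic follows; the helper names percent_increase and round1 are made up for the example, and the PT9 rescaling step mentioned above is not reproduced because its formula is not given.

```cpp
#include <cassert>
#include <cmath>

// Percentage gain going from `before` to `after`.
double percent_increase(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

// Rounds to one decimal place, as the figures in the post are presented.
double round1(double v) {
    return std::round(v * 10.0) / 10.0;
}
```

    For example, round1(percent_increase(2529, 3022)) gives 19.5 for the Ryzen 9 3900 row and round1(percent_increase(2778, 3003)) gives 8.1 for the i7-8700k row.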

    But to be clear, the old code was totally valid code & rand() has been the go-to function for random number generation for 30+ years. Rand was used everywhere historically. Microsoft have now given it inconsistent performance, however.

    It's a good illustration of just how fickle CPUs are to different code. And to some degree what folly it is to rely on a single benchmark. People need to look at a range of results. It isn't reasonable to expect all benchmarks (or real life apps) to get consistent results across a range of CPUs. Despite the fact that this change should bring us slightly closer to community expectation, we are still of the opinion that diversity of results highlighting different aspects of a CPU is a good thing. (It really isn't a good thing if all benchmarks match Cinebench and POV).

    This change has been rolled out in PerformanceTest V10 build 1004.



  • bengo321
    replied
    Look for example in your own table: there is the i7-4770k@3.5GHz and the i5-4690k@3.5GHz. With PT9 these CPUs scored 2249 and 2235, which sounds pretty good (same clock and gen), but now with PT10 the i7 is rated at 1962 and the i5 at 2195. Back in the days when I bought this i5, the test scaled with clock speed when you OCed it; now you cannot use it for a simple comparison within one product line. (These two CPUs are just an example; there are more cases where it doesn't look right at all.)
    I would really like to get data that is at least linear for things that can more or less be compared apples to apples. The way the ratings are at the moment, you cannot even use them for a rough estimate of how a system is performing, unless you know what old PT9 looked like, how it changed to PT10, and how this is and isn't similar to other benchmarks. And at that point, for me, I will just use another benchmark, which is sad, because I was used to this one: it was fast to use and easy to find results for each CPU.



  • demonsavatar
    replied
    Originally posted by David (PassMark)
    So for this real life app we see the following
    (remember lower score are better for this POV rendering test)

    Intel Core i7-9700K @ 3.60GHz
    POV-Ray result: 505

    AMD Ryzen 7 2700X
    POV-Ray result: 633 (25% slower)

    AMD Ryzen 5 2600X
    POV-Ray result: 636 (26% slower)

    So these results line up pretty well with PT10.

    Cinebench (single threaded) gives results of 16% and 18%.
    Y-Cruncher (single threaded) gives results of 88% and 95% (big difference is due to AVX instructions in Intel's CPU)

    So if we accept that the Ryzen 3000 series gives +13% performance over the 2000 series, then that still puts them around 10 to 15% behind the best Intel CPUs.

    I am sure it is possible to cherry-pick counter examples, but hopefully the majority of people will see the new results as an improvement over what we had.
    Originally posted by HwGeek
    Dear David,
    As you can see, the problem is with the Zen 2.0 scores: even if you compare the bench from TH, the Zen 2.0 ST performance is better than Coffee Lake and only the 5GHz 9900K/S can match it.
    On average the 9900K and Ryzen 3900X/3950X should be within a ~5% margin, not ~20% like the PT10 ST scores show.
    The discrepancy between the data you guys posted is interesting. The problem with a lot of the tests that you guys picked out is they all use AVX. That means 2700X and 2600X are at a huge disadvantage against 9700K, since they don't have 256-bit wide AVX (they split it into two 128-bit instructions). Zen2 does have full 256-bit wide AVX, so the gap closes (or even surpasses) vs Intel in the data HwGeek shows. I think Y-Cruncher chart HwGeek posted shows this the best, with all the last gen ThreadRippers doing worse than any Zen2 proc by far, but 9980XE outperforming everything just because of AVX512.

