Poor CPU Mark of PassMark v11.1 on Z840, 32K vs 44~48K

  • Keelung
    Junior Member
    • Oct 2023
    • 8

    #1

    PassMark PerformanceTest v11.1 reports a CPU Mark of 32644 for my dual E5-2699A V4 system, well below the 44~48K range that others get.

    I tested both BIOS v02.61 and v02.62 and got similar results. I even used the Clear CMOS button on the motherboard before testing.

    The Windows power plan is set to High Performance, of course.

    This should not be a thermal issue: when I run the AIDA64 CPU stress test (FPU only) for more than 30 minutes, the maximum temperature is 68°C (CPU0) and 67°C (CPU1).

    I also tried running PerformanceTest in Windows PE and got similar results.

    And my Baseline ID is https://www.passmark.com/baselines/V...d=333174120742

    [Image: Z840_CPU_MARK.png]
  • David (PassMark)
    Administrator
    • Jan 2003
    • 11029

    #2
    It is hard to get great performance from dual-socket systems. There are lots of ways to mess up the configuration, especially the RAM setup.

    In your case, each CPU has a quad-channel memory interface, meaning that each socket needs at least 4 sticks of RAM for optimal performance and less NUMA impact. (So 8 sticks of RAM would be ideal.)

    But your system seems to have only 2 RAM sticks, which is 6 short of a picnic.

    Which also explains this:
    [Image: image.png]
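
The stick count above can be checked with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it assumes one stick per channel with the sticks spread evenly across sockets, which the thread does not confirm for this particular machine.

```python
def dimms_for_full_population(sockets: int, channels_per_socket: int) -> int:
    """Sticks needed to put one DIMM on every memory channel."""
    return sockets * channels_per_socket


def channels_in_use_per_socket(dimms: int, sockets: int, channels_per_socket: int) -> int:
    """Channels populated on each socket.

    Assumption: sticks are split evenly across sockets and each stick
    sits on its own channel until all channels are used.
    """
    return min(dimms // sockets, channels_per_socket)


needed = dimms_for_full_population(sockets=2, channels_per_socket=4)
installed = 2
print(f"needed={needed}, installed={installed}, short by {needed - installed}")
print("channels in use per socket:", channels_in_use_per_socket(installed, 2, 4))
```

Under these assumptions, two sticks in a dual-socket quad-channel system leave each CPU running on a single memory channel.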

    • Keelung
      Junior Member
      • Oct 2023
      • 8

      #3
      David, thanks for pointing out the NUMA issue.
      When I was building my Z840 home workstation, a hardware colleague told me that using fewer sticks of RAM would be more reliable.
      So I bought two 32GB sticks of RAM for the two CPUs. 64GB of memory is enough for my daily usage.

      After learning a little about NUMA, my current understanding is: if no application requires more than the available local memory, then the OS has no need to allocate remote memory?
      The only thing I care about is the CPU Mark, not the system mark.

      I care about the CPU Mark because I saw no improvement when calculating sha256sum of many files on an NVMe SSD in parallel after upgrading the two E5-2696 V4 CPUs in my Z840 workstation to two E5-2699A V4.
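
For readers who want to reproduce this kind of workload, here is a minimal Python sketch of hashing many files in parallel. It is not the tool or script used in the post, which is not shown; the function names `sha256_of_file` and `hash_files` and the 1 MiB chunk size are illustrative assumptions. It relies on threads because `hashlib` releases the GIL while hashing large buffers.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_of_file(path, chunk_size=1 << 20):
    """Return (path, hex digest), reading the file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return path, digest.hexdigest()


def hash_files(paths, workers=None):
    """Hash all files concurrently and return {path: hex digest}.

    workers=None lets ThreadPoolExecutor pick its default thread count.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(sha256_of_file, paths))
```

Calling `hash_files(list_of_paths)` returns a dictionary of digests. Whether throughput scales with core count depends on storage speed and memory bandwidth as well as the CPUs.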

      I want to try adding more memory sticks, but they're too expensive now.
      I spent 400+ RMB on 2x32GB of DDR4 memory about two years ago.
      But now even a single 32GB stick costs more than 1000 RMB in the China market.

      • David (PassMark)
        Administrator
        • Jan 2003
        • 11029

        #4
        "I can use fewer sticks of RAM to get more reliability"
        Technically correct. Half the hardware is half the chance of failure. But by this logic, using no hardware would be the best option (no hardware == no chance of failure). And 8 x 8GB has the same number of bits as 2 x 32GB, so this isn't really half the hardware. So I don't think getting two sticks was great advice for this system.

        Look at this table we made showing the impact of un-optimised 1 channel RAM vs 4 channel optimised.
        Difference can be up to 176%.
        [Image: image.png]

        NUMA has a potential additional (negative) impact on top of this.

        "if no application requires more than the available local memory, then the OS has no need to allocate remote memory?"
        This isn't the entire story; it is a lot more complex. Windows doesn't schedule an entire application onto a physical CPU and its associated RAM. It schedules the individual threads within the application, and some effort is made to load balance. So you can end up with half the application running on CPU1 and half on CPU2, but all the RAM allocated on CPU1.
        If application developers spend a lot of effort they can manually control where everything runs, but there are too many hardware permutations for this to work well, so nearly no one does this. The result is often poor performance (seemingly at random).

        Also, some versions of Windows (like the Home editions) don't deal with multiple sockets at all.

        There are some (now slightly old) benchmark results of NUMA testing we did here.
        In the worst case the impact was 60%.

        So for you, in the (probably rare) worst case, you are giving up 178% + 60% in performance gains (238% in total).

        • Keelung
          Junior Member
          • Oct 2023
          • 8

          #5
          David, I was looking for test results to preview the possible improvement on my workstation, and your RAM test table is exactly what I was looking for!

          From the RAM test table, when the number of memory channels is doubled, the CPU Mark gains about 20%.
          This can explain my CPU Mark gap compared to others: 32644 * 1.2 = 39173 and 32644 * 1.4 = 45702, which is finally in the 44~48K range of others!

          So the root cause should be the memory configuration.
          I'll try it. THANKS!!!
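
The estimate above can be reproduced with a short Python sketch. The 32644 baseline and the roughly 20% gain per channel doubling come from the posts; treating the two doublings (1 to 2 to 4 channels) as either additive (the 1.4 factor used above) or compounding is an assumption made here for illustration, and neither is a measured result.

```python
BASELINE_CPU_MARK = 32644   # score reported in post #1
GAIN_PER_DOUBLING = 0.20    # rough figure read from the RAM table in post #5


def additive_estimate(baseline, gain, doublings):
    """Add the same gain once per doubling (1.2, then 1.4, as in the post)."""
    return round(baseline * (1 + gain * doublings))


def compounded_estimate(baseline, gain, doublings):
    """Alternative assumption: each doubling multiplies the previous score."""
    return round(baseline * (1 + gain) ** doublings)


for doublings in (1, 2):
    print(
        doublings,
        additive_estimate(BASELINE_CPU_MARK, GAIN_PER_DOUBLING, doublings),
        compounded_estimate(BASELINE_CPU_MARK, GAIN_PER_DOUBLING, doublings),
    )
```

Both variants land inside the 44~48K range after two doublings, so the conclusion does not hinge on which assumption is used.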
