We have had a couple of reports of poor performance results with the new P4 CPUs when Hyper-Threading is enabled. We now have a Hyper-Threading (HT) test machine and have done some testing with PerformanceTest.
This confirmed that the standard tests in version 4.0 don't show great results with Hyper-Threading enabled, and that the poor results are not caused by a "bug" in PerformanceTest. We initially thought that the MaxMegaFLOPS test would give good results with HT enabled. However, there are some test scenarios in which HT does very badly, and the MaxMegaFLOPS test is one of them. These scenarios tend to be slightly artificial, though, because all the test threads are doing the same activity at the same time. HT works best when a variety of different activities are happening simultaneously.
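To make the "variety of different activity" idea concrete, here is a minimal Python sketch of a mixed-workload test. It is not how PerformanceTest is implemented; the three workloads (prime_search, float_math, compression) and the run_all_at_once helper are hypothetical stand-ins chosen only for illustration. Each workload counts how much work it finishes inside a fixed time budget, and run_all_at_once starts one process per workload so that they overlap in time instead of running one after another.

```python
import multiprocessing as mp
import time
import zlib


def prime_search(seconds):
    """Count primes found by trial division until the time budget runs out."""
    deadline = time.perf_counter() + seconds
    found, n = 0, 2
    while time.perf_counter() < deadline:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
        n += 1
    return found


def float_math(seconds):
    """Count batches of 1000 floating-point multiply-adds completed."""
    deadline = time.perf_counter() + seconds
    batches, x = 0, 1.0001
    while time.perf_counter() < deadline:
        for _ in range(1000):
            x = x * 1.0000001 + 0.0000001
        batches += 1
    return batches


def compression(seconds):
    """Count 64 KB buffers compressed with zlib."""
    deadline = time.perf_counter() + seconds
    data = bytes(range(256)) * 256  # 65536 bytes
    buffers = 0
    while time.perf_counter() < deadline:
        zlib.compress(data)
        buffers += 1
    return buffers


# Hypothetical mix of activities; a real test would use many more.
WORKLOADS = [prime_search, float_math, compression]


def run_all_at_once(seconds):
    """Run every workload in its own process so they all overlap in time."""
    with mp.Pool(processes=len(WORKLOADS)) as pool:
        pending = [(w.__name__, pool.apply_async(w, (seconds,))) for w in WORKLOADS]
        return {name: result.get() for name, result in pending}
```

Saved as a script, calling run_all_at_once(2.0) from inside an `if __name__ == "__main__":` guard returns one score per workload. Running it once with HT off and once with HT on gives a pair of scores to compare for each activity. The absolute numbers depend entirely on the machine and are not comparable with PerformanceTest results.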
We have now created a more realistic multithreaded test.
Here is the link to the new (currently pre-release) version of PerformanceTest (Version 4.0 build 1009):
(Link removed - 10/Dec/2003 - Final release version is now available. Download from: http://www.passmark.com/download/index.htm)
The two main changes in this version are the new Advanced Multi-process testing window and the updated Advanced Disk Testing window. The Advanced Multi-process test shows a significant improvement with HT (around +20%).
See the sample results below.
Multiple Process Test Results Hyper Threading OFF
CPU Type: Pentium 4 2.8GHz
Memory Information: 510 MB RAM in 1 slots
Test Results
Prime Number Search 41538 Primes Found
Sorting Random Strings 107782 Thousand strings sorted per sec.
Compression 176 KBytes per sec.
Encryption 782.57 KBytes per sec.
Image Rotation 15.22 Rotations per second
MMX Addition 100.10 Million Ops/Sec.
MMX Multiplication 100.10 Million Ops/Sec.
Integer Addition 56.02 Million Ops/Sec.
Whetstone 89.53 Million Ops/Sec.
Dhrystone 159.86 Million Ops/Sec.
Dhrystones per sec: 280878
Memory Read 164.15 MB/Sec.
Memory Write 164.25 MB/Sec.
Disk Access 0.06 MB/Sec. Sequential, 50% Read/50% Write.
Multiple Process Test Results Hyper Threading ON
CPU Information: 2xPentium 4 (HT)
Memory Information: 510 MB RAM in 1 slots
Test Results
Prime Number Search 49098 Primes Found
Sorting Random Strings 138167 Thousand strings sorted per sec.
Compression 230 KBytes per sec.
Encryption 1053.25 KBytes per sec.
Image Rotation 17.61 Rotations per second
MMX Addition 127.25 Million Ops/Sec.
MMX Multiplication 127.34 Million Ops/Sec.
Integer Addition 57.59 Million Ops/Sec.
Whetstone 176.06 Million Ops/Sec.
Dhrystone 198.60 Million Ops/Sec.
Dhrystones per sec: 348939
Memory Read 217.96 MB/Sec.
Memory Write 218.03 MB/Sec.
Disk Access 0.14 MB/Sec. Sequential, 50% Read/50% Write.
Note that all of the above tests run at the same time (not sequentially) and that higher results are better.
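For anyone who wants the per-test difference rather than a single overall figure, the short Python sketch below works out the percentage change for each line of the two listings above. The numbers are copied straight from those listings; the names RESULTS, percent_gain and gains are just illustrative. The gain is not uniform across tests, so it is worth looking at each line individually.

```python
# Figures copied from the two result listings above: (HT off, HT on).
RESULTS = {
    "Prime Number Search": (41538, 49098),
    "Sorting Random Strings": (107782, 138167),
    "Compression": (176, 230),
    "Encryption": (782.57, 1053.25),
    "Image Rotation": (15.22, 17.61),
    "MMX Addition": (100.10, 127.25),
    "MMX Multiplication": (100.10, 127.34),
    "Integer Addition": (56.02, 57.59),
    "Whetstone": (89.53, 176.06),
    "Dhrystone": (159.86, 198.60),
    "Memory Read": (164.15, 217.96),
    "Memory Write": (164.25, 218.03),
    "Disk Access": (0.06, 0.14),
}


def percent_gain(off, on):
    """Percentage change going from the HT-off score to the HT-on score."""
    return (on / off - 1.0) * 100.0


def gains(results):
    """Map each test name to its percentage gain with HT enabled."""
    return {name: percent_gain(off, on) for name, (off, on) in results.items()}


if __name__ == "__main__":
    for name, gain in gains(RESULTS).items():
        print(f"{name:<24}{gain:+7.1f}%")
```

Because higher results are better for every test here, a positive percentage means HT helped on that line.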
David