Hi again, some more questions here.
They're not directly related to problems or crashes, but I'm looking for some help in understanding.
So no urgent problem solving needed.
But I'd be happy if someone found the time to share a bit of knowledge.
System is:
Asus Z9PE-D8 WS: dual-socket, quad-channel, Intel C602 chipset mainboard
4×/8× Samsung M386B4G70DM0-CMA4: DDR3, 1866MHz, Reg-ECC, 32GB DIMMs
2× Intel Xeon E5-2690v2: 10c/20t 3.00GHz CPUs
First thing is the recognition of the module specs.
I don't know how the SPDs or SMBIOSes of these DIMMs are programmed, but I wonder why so much information and so many features show up as n/a in Memtest.
RAM Info is as follows:
Memory summary:
Number of RAM slots: 8
Number of RAM modules: 8
Number of RAM SPDs detected: 8
Total Physical Memory: 262114M
SPD Details:
--------------
SPD #: 1
==============
RAM Type: DDR3
Maximum Clock Speed (MHz): 933 (JEDEC)
Maximum Transfer Speed (MHz): DDR3-1867
Maximum Bandwidth (MB/s): PC3-14900
Memory Capacity (MB): 32768
Jedec Manufacture Name: Samsung
SPD Revision: 1.2
Registered: No
ECC: Yes
DIMM Slot #: 1
Manufactured: Week 40 of Year 2014
Module Part #: M386B4G70DM0-CMA4
Module Revision: 0x0000
Module Serial #: 0x394A4D3B
Module Manufacturing Location: 0x02
# of Row Addressing Bits: 16
# of Column Addressing Bits: 11
# of Banks: 8
# of Ranks: 4
Device Width in Bits: 4
Bus Width in Bits: 64
Module Voltage: 1.5V
CAS Latencies Supported: 6 7 8 9 10 11 13
Timings @ Max Frequency (JEDEC): 13-13-13-32
Maximum Clock Speed (MHz): 933
Maximum Transfer Speed (MHz): DDR3-1867
Maximum Bandwidth (MB/s): PC3-14900
Minimum Clock Cycle Time, tCK (ns): 1.071
Minimum CAS Latency Time, tAA (ns): 13.125
Minimum RAS to CAS Delay, tRCD (ns): 13.125
Minimum Row Precharge Time, tRP (ns): 13.125
Minimum Active to Precharge Time, tRAS (ns): 34.000
Minimum Row Active to Row Active Delay, tRRD (ns): 5.000
Minimum Auto-Refresh to Active/Auto-Refresh Time, tRC (ns): 47.125
Minimum Auto-Refresh to Active/Auto-Refresh Command Period, tRFC (ns): 260.000
DDR3 Specific SPD Attributes
Write Recover Time, tWR (ns): 15.000
Internal Write to Read Command Delay, tWTR (ns): 7.500
Internal Read to Precharge Command Delay, tRTP (ns): 7.500
Minimum Four Activate Window Delay (ns): 27.000
RZQ / 6 Supported: Yes
RZQ / 7 Supported: Yes
DLL-Off Mode Supported: Yes
Maximum Operating Temperature Range (C): 0-95C
Refresh Rate at Extended Operating Temperature Range: 2X
Auto-Self Refresh Supported: No
On-die Thermal Sensor Readout Supported: No
Partial Array Self Refresh Supported: No
Thermal Sensor Present: Yes
Non-standard SDRAM Type: 00
Module Type: Reserved
Module Height (mm): -1 - 0
Module Thickness (mm): front -1-0 , back -1-0
Module Width (mm):
Reference Raw Card Used:
DRAM Manufacture: Samsung
SMBIOS Details:
--------------
DIMM #: 1
==============
Total Width: 72 bits
Data Width: 64 bits
Size: 32768 MB
Form Factor: DIMM
Device Set: 0
Device Locator: DIMM_A1
Bank Locator: DIMM_A1
Memory Type: DDR3
Type Detail: Synchronous
Speed: 1866 MT/s
Manufacturer: Samsung
Serial Number: 394A4D3B
Asset Tag: DIMM_A1_AssetTag
Part Number: M386B4G70DM0-CM
Attributes: 00000004
Configured Memory Speed: 1866 MT/s
Minimum Voltage: N/A
Maximum Voltage: N/A
Configured Voltage: N/A
Memory Technology: Unknown
Memory Operating Mode Capability: Unknown
Firmware Version:
Module Manufacturer ID: N/A
Module Product ID: N/A
Memory Subsystem Controller Manufacturer ID: N/A
Memory Subsystem Controller Product ID: N/A
Non Volatile Size: N/A
Volatile Size: N/A
Cache Size: N/A
Logical Size: N/A
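As a sanity check on the SPD values above, the module capacity and the JEDEC timing line can be recomputed from the other fields. This is only a minimal sketch in Python: the function names are made up for illustration, the capacity formula is the usual DDR3 one (chip density × chips per rank × ranks), and the plain round-up to whole clock cycles is an assumption that ignores any rounding guard band a real decoder may apply.

```python
import math

# Values copied from the SPD dump above.
ROW_BITS = 16
COL_BITS = 11
BANKS = 8
RANKS = 4
DEVICE_WIDTH_BITS = 4
BUS_WIDTH_BITS = 64  # data bits only, ECC bits not counted

T_CK_NS = 1.071
T_AA_NS = 13.125
T_RCD_NS = 13.125
T_RP_NS = 13.125
T_RAS_NS = 34.000


def module_capacity_mb():
    """Module capacity in MB, derived from the SPD geometry fields."""
    # Bits stored by one DRAM chip: addresses * banks * device width.
    chip_bits = (1 << (ROW_BITS + COL_BITS)) * BANKS * DEVICE_WIDTH_BITS
    # Number of chips needed to fill the 64-bit data bus of one rank.
    chips_per_rank = BUS_WIDTH_BITS // DEVICE_WIDTH_BITS
    total_bytes = chip_bits // 8 * chips_per_rank * RANKS
    return total_bytes // (1024 * 1024)


def to_cycles(t_ns, t_ck_ns=T_CK_NS):
    """Round a minimum time in ns up to whole clock cycles (assumed rule)."""
    return math.ceil(t_ns / t_ck_ns)


capacity = module_capacity_mb()
timings = [to_cycles(t) for t in (T_AA_NS, T_RCD_NS, T_RP_NS, T_RAS_NS)]

print(capacity)                      # compare with "Memory Capacity (MB)"
print("-".join(map(str, timings)))   # compare with "Timings @ Max Frequency"
```

With the values from the dump this reproduces 32768 MB and 13-13-13-32, so the geometry and timing fields at least look internally consistent, even though the module type field reads "Reserved".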
First subtopic of interest is this entry under SPD Details:
Registered: No
I am pretty sure these are Registered DIMMs, as a capacity of 32GB per DDR3 DIMM could only be built as a Registered DIMM ...right?
How come Memtest does not recognize this?
Does this have any effect on testing?
I have run dozens of complete test suites already, but I don't know how a registered or unbuffered state would be reflected in the logs.
Second is the info in the SMBIOS Details section:
A lot of the information there is not available.
Is this common for Reg-ECC DIMMs?
Does it have any effect that this info can't be read, or isn't programmed at all?
Another topic is the performance of the tests in relation to the NUMA setting in the board's BIOS setup.
While scrolling through some logs of passed tests, I noticed repeated entries like this:
2021-12-26 07:38:03 - Running test #7 (Test 7 [Moving inversions, 32-bit pattern])
2021-12-26 07:38:03 - MtSupportRunAllTests - Setting random seed to 0x3BC48C3E
2021-12-26 07:38:03 - MtSupportRunAllTests - Start time: 81241323 ms
2021-12-26 07:38:03 - MtSupportRunAllTests - Enabling memory cache for test
2021-12-26 07:38:03 - MtSupportRunAllTests - Enabling memory cache complete
2021-12-26 07:38:03 - Start memory range test (0x0 - 0x20C0000000)
2021-12-26 07:38:35 - GetIA32ArchitecturalTemp - MSR(0x19C) = 88370000 (Vendor ID: GenuineIntel 6 3E 4)
2021-12-26 07:38:35 - MapTempIntel - MSR(0x1A2) = 640E00
[...]
2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #32 to finish (BSP test time = 22814ms)
2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #34 to finish (BSP test time = 22814ms)
2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #36 to finish (BSP test time = 22814ms)
2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #38 to finish (BSP test time = 22814ms)
Sometimes those warnings add up to 20~30 per test.
These warnings had no effect on the number of reported errors.
The test runs were always marked as "passed".
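To put numbers on how often this happens, the warnings can be counted per test straight from the log. The following is just a rough sketch in Python; the regular expressions are guessed from the log lines quoted above and are not an official description of the Memtest log format, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt above (assumption, not a spec).
TEST_RE = re.compile(r"Running test #(\d+)")
WARNING_RE = re.compile(r"WARNING - waited for (\d+)s for CPU #(\d+) to finish")


def count_wait_warnings(lines):
    """Map test number -> number of 'waited for ... CPU' warnings."""
    counts = Counter()
    current_test = None  # warnings seen before any test header land under None
    for line in lines:
        test_match = TEST_RE.search(line)
        if test_match:
            current_test = int(test_match.group(1))
            continue
        if WARNING_RE.search(line):
            counts[current_test] += 1
    return counts


sample = [
    "2021-12-26 07:38:03 - Running test #7 (Test 7 [Moving inversions, 32-bit pattern])",
    "2021-12-26 07:38:03 - Start memory range test (0x0 - 0x20C0000000)",
    "2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #32 to finish (BSP test time = 22814ms)",
    "2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #34 to finish (BSP test time = 22814ms)",
    "2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #36 to finish (BSP test time = 22814ms)",
    "2021-12-26 08:34:37 - WARNING - waited for 10s for CPU #38 to finish (BSP test time = 22814ms)",
]

for test_number, total in sorted(count_wait_warnings(sample).items()):
    print(f"test {test_number}: {total} warnings")
```

Feeding it a whole log file (for example `count_wait_warnings(open(path))`, with `path` pointing at the saved log) should make it easier to compare runs with and without NUMA.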
After some experimenting I found out that these warnings don't appear when I set the NUMA option in BIOS setup to "disabled".
Also, the complete test suites (4 runs of 13) finish notably faster.
With NUMA enabled, a complete 4×13 test run finished in 47:53:47 hrs when using 4 DIMMs for 2 CPUs (dual-channel config).
Without NUMA, the 4×13 test run finished in 38:00:48 hrs.
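For a clearer picture of the difference, here is the arithmetic on those two run times as a small Python sketch. It assumes the 47:53:47 run was the one with NUMA enabled and that both runs used the same DIMM configuration; the helper name is made up for illustration.

```python
def hms_to_seconds(text):
    """Convert an 'hh:mm:ss' string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


numa_on = hms_to_seconds("47:53:47")   # assumed: NUMA enabled
numa_off = hms_to_seconds("38:00:48")  # NUMA disabled
saved = numa_on - numa_off

print(f"NUMA enabled:  {numa_on} s")
print(f"NUMA disabled: {numa_off} s")
print(f"difference:    {saved} s ({saved / numa_on:.1%} of the NUMA-enabled run)")
```

That works out to a difference of 35579 s, a bit under 10 hours, or roughly 20.6% of the longer run.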
There are notably more warnings with a single-channel setup (1 DIMM per CPU) than with a dual-channel setup.
I have a rough idea of how NUMA memory access works, but no detailed info about how it's organized in Ivy Bridge-EP systems.
Does non-uniform memory access apply to internal access paths within one CPU, or does it apply to "cross-CPU" access, like from cores of CPU1 to memory managed by CPU2?
Why do these delays occur when NUMA is enabled?
In which cases is it useful to enable NUMA mode, and what kind of task profits from it?
When is it useful to disable it?
Thanks for reading ^ ^