Hello,
Short version: after an emergency shutdown my server logs ECC errors. In memtest they show up only during Test #0 and Test #1, and they happen for every module, regardless of which slot it is seated in. I think the problem is not the RAM; maybe the CPU or mainboard is fried. Details follow.
Today I had a power outage lasting a few minutes, which was covered by the UPS. The system did an emergency shutdown. After booting the system I noticed a bunch of these errors in the syslog:
Code:
Jun 6 08:11:20 prxsrv kernel: [ 0.386924] EDAC MC: Ver: 3.0.0
Jun 6 08:11:20 prxsrv kernel: [ 12.168935] EDAC MC0: Giving out device to module ie31200_edac controller IE31200: DEV 0000:00:00.0 (POLLED)
Jun 6 08:11:22 prxsrv kernel: [ 15.218862] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#0 (csrow:2 channel:0 page:0x0 offset:0x0 grain:
Jun 6 08:11:22 prxsrv kernel: [ 15.218864] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#1 (csrow:2 channel:1 page:0x0 offset:0x0 grain:
Jun 6 08:11:26 prxsrv kernel: [ 19.314923] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:
Jun 6 08:11:33 prxsrv kernel: [ 25.462965] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#0 (csrow:2 channel:0 page:0x0 offset:0x0 grain:
Jun 6 08:11:52 prxsrv kernel: [ 44.904811] EDAC MC0: 1 UE UE overwrote CE on any memory ( page:0x0 offset:0x0 grain:
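To see at a glance which csrow/channel pairs the kernel is complaining about, the EDAC lines can be tallied with a small script. This is only a sketch: the regular expression is written against the message format visible in the log above, the function name `tally_edac` is made up for illustration, and other EDAC drivers or kernel versions may word their messages differently.

```python
import re
from collections import Counter

# Matches the per-location EDAC messages shown in the log above, e.g.
# "EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#0 (csrow:2 channel:0 ..."
# Assumption: the format is taken from this one log, not from a specification.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>UE|CE) .*?"
    r"\(csrow:(?P<csrow>\d+) channel:(?P<channel>\d+)"
)


def tally_edac(lines):
    """Sum reported errors per (kind, mc, csrow, channel).

    Lines without a csrow/channel location (driver banner, the
    "UE overwrote CE" message) are skipped.
    """
    totals = Counter()
    for line in lines:
        m = EDAC_RE.search(line)
        if m:
            key = (m["kind"], int(m["mc"]), int(m["csrow"]), int(m["channel"]))
            totals[key] += int(m["count"])
    return totals


# Lines copied from the syslog excerpt above (truncated after "grain:" as posted).
sample = [
    "Jun 6 08:11:20 prxsrv kernel: [ 0.386924] EDAC MC: Ver: 3.0.0",
    "Jun 6 08:11:22 prxsrv kernel: [ 15.218862] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#0 (csrow:2 channel:0 page:0x0 offset:0x0 grain:",
    "Jun 6 08:11:22 prxsrv kernel: [ 15.218864] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#1 (csrow:2 channel:1 page:0x0 offset:0x0 grain:",
    "Jun 6 08:11:26 prxsrv kernel: [ 19.314923] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:",
    "Jun 6 08:11:33 prxsrv kernel: [ 25.462965] EDAC MC0: 1 UE ie31200 UE on mc#0csrow#2channel#0 (csrow:2 channel:0 page:0x0 offset:0x0 grain:",
    "Jun 6 08:11:52 prxsrv kernel: [ 44.904811] EDAC MC0: 1 UE UE overwrote CE on any memory ( page:0x0 offset:0x0 grain:",
]

for key, n in sorted(tally_edac(sample).items()):
    print(key, n)
```

On a live system the same function could be fed the lines of the syslog file instead of the hard-coded sample.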
The system:
Code:
EFI Specifications: 2.40
System
  Manufacturer: Supermicro
  Product Name: Super Server
  Version: 0123456789
  Serial Number: 0123456789
BIOS
  Vendor: American Megatrends Inc.
  Version: 2.5
  Release Date: 11/26/2020
Baseboard
  Manufacturer: Supermicro
  Product Name: X11SSH-F
  Version: 1.01
  Serial Number: ZM17AS029357
CPU Type: Intel Xeon E3-1245 v6 @ 3.70GHz
  CPU Clock: 3697 MHz [Turbo: 3776.6 MHz]
  # Logical Processors: 8 (4 enabled for testing)
  L1 Cache: 4 x 64K (50607 MB/s)
  L2 Cache: 4 x 256K (22413 MB/s)
  L3 Cache: 8192K (13610 MB/s)
Memory: 65356M (8160 MB/s)
Number of RAM SPDs detected: 4
  SPD #0: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / 03C41ABB
    19-19-19-43 / 2666 MHz / 1.2V
  SPD #1: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / F6841ABA
    19-19-19-43 / 2666 MHz / 1.2V
  SPD #2: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / F6841088
    19-19-19-43 / 2666 MHz / 1.2V
  SPD #3: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / EB84198F
    19-19-19-43 / 2666 MHz / 1.2V
Number of RAM slots: 4
Number of RAM modules: 4
  DIMM Slot #0: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / 03C41ABB
    2667 MHz
  DIMM Slot #1: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / F6841ABA
    2667 MHz
  DIMM Slot #2: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / F6841088
    2667 MHz
  DIMM Slot #3: 16GB DDR4 ECC PC4-21300
    Kingston / 9965684-034.A00G / EB84198F
    2667 MHz
The tests I did:
- all 4 RAM modules
- only 2 RAM modules in dual-channel configuration (first A, then B)
- only one RAM module, tried in all RAM slots
All modules:
Code:
Result summary
Test Start Time: 2021-06-06 13:13:16
Elapsed Time: 2:46:42
Memory Range Tested: 0x0 - 1075800000 (67416MB)
CPU Selection Mode: Parallel (All CPUs)
CPU Temperature Min/Max/Ave: 31C/36C/34C
RAM Temperature Min/Max/Ave: 52C/62C/57C
ECC Polling: Enabled
# Tests Passed: 11/11 (100%)

ECC Correctable Errors: 66
ECC Uncorrectable Errors: 0

Test                                        # Tests Passed  Errors
Test 0 [Address test, walking ones, 1 CPU]  1/1 (100%)      0
Test 1 [Address test, own address, 1 CPU]   1/1 (100%)      0
Test 2 [Address test, own address]          1/1 (100%)      0
Test 3 [Moving inversions, ones & zeroes]   1/1 (100%)      0
Test 4 [Moving inversions, 8-bit pattern]   1/1 (100%)      0
Test 5 [Moving inversions, random pattern]  1/1 (100%)      0
Test 6 [Block move, 64-byte blocks]         1/1 (100%)      0
Test 7 [Moving inversions, 32-bit pattern]  1/1 (100%)      0
Test 8 [Random number sequence]             1/1 (100%)      0
Test 9 [Modulo 20, ones & zeros]            1/1 (100%)      0
Test 10 [Bit fade test, 2 patterns, 1 CPU]  1/1 (100%)      0
Test 13 [Hammer test]                       0/0 (0%)        0

Last 10 Errors
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1FC00,0), ECC Corrected: Yes, Syndrome: 00FF, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1FC00,8), ECC Corrected: Yes, Syndrome: 0077, Channel/Slot: 0/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1F280,8), ECC Corrected: Yes, Syndrome: 00AC, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1F280,C), ECC Corrected: Yes, Syndrome: 00DB, Channel/Slot: 0/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1E900,8), ECC Corrected: Yes, Syndrome: 00D5, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1E900,8), ECC Corrected: Yes, Syndrome: 0012, Channel/Slot: 0/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1DF80,0), ECC Corrected: Yes, Syndrome: 00E2, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1DF80,8), ECC Corrected: Yes, Syndrome: 00CF, Channel/Slot: 0/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1D600,8), ECC Corrected: Yes, Syndrome: 0041, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1D600,C), ECC Corrected: Yes, Syndrome: 00C5, Channel/Slot: 0/0
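The "Last 10 Errors" lines have a regular shape, so they can be grouped by test number and channel/slot to check whether the errors really cluster in one place. The sketch below assumes the exact line format shown in the report above; the helper names `parse_ecc_errors` and `tally_by_location` are hypothetical, and the field values are kept as strings because the report does not say which number base they use.

```python
import re
from collections import Counter

# Pattern written against the error lines in the report above.
# Assumption: every ECC error line follows this layout.
ECC_RE = re.compile(
    r"\[ECC Error\] Test: (?P<test>\d+), "
    r"\(Rank,Bank,Row,Col\): "
    r"\((?P<rank>\w+),(?P<bank>\w+),(?P<row>\w+),(?P<col>\w+)\), "
    r"ECC Corrected: (?P<corrected>\w+), "
    r"Syndrome: (?P<syndrome>\w+), "
    r"Channel/Slot: (?P<channel>\d+)/(?P<slot>\d+)"
)


def parse_ecc_errors(lines):
    """Return one dict of string fields per line that matches ECC_RE."""
    errors = []
    for line in lines:
        m = ECC_RE.search(line)
        if m:
            errors.append(m.groupdict())
    return errors


def tally_by_location(errors):
    """Count errors per (test, channel, slot)."""
    return Counter((e["test"], e["channel"], e["slot"]) for e in errors)


# First four error lines from the all-modules run above.
sample = [
    "[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1FC00,0), ECC Corrected: Yes, Syndrome: 00FF, Channel/Slot: 1/0",
    "[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1FC00,8), ECC Corrected: Yes, Syndrome: 0077, Channel/Slot: 0/0",
    "[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1F280,8), ECC Corrected: Yes, Syndrome: 00AC, Channel/Slot: 1/0",
    "[ECC Error] Test: 1, (Rank,Bank,Row,Col): (0,0,1F280,C), ECC Corrected: Yes, Syndrome: 00DB, Channel/Slot: 0/0",
]

errors = parse_ecc_errors(sample)
for (test, channel, slot), n in sorted(tally_by_location(errors).items()):
    print(f"test {test}, channel {channel}, slot {slot}: {n} error(s)")
```

Grouping by `row` instead of channel/slot works the same way and would show that the errors in this run come in pairs on the same row across both channels.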
Single module:
Code:
Result summary
Test Start Time: 2021-06-06 11:19:31
Elapsed Time: 0:01:01
Memory Range Tested: 0x0 - 475800000 (18264MB)
CPU Selection Mode: Parallel (All CPUs)
CPU Temperature Min/Max/Ave: 30C/30C/30C
RAM Temperature Min/Max/Ave: 50C/50C/50C
ECC Polling: Enabled
# Tests Passed: 4/4 (100%)

ECC Correctable Errors: 10
ECC Uncorrectable Errors: 0

Test                                        # Tests Passed  Errors
Test 0 [Address test, walking ones, 1 CPU]  1/1 (100%)      0
Test 1 [Address test, own address, 1 CPU]   1/1 (100%)      0
Test 2 [Address test, own address]          1/1 (100%)      0
Test 3 [Moving inversions, ones & zeroes]   1/1 (100%)      0

Last 10 Errors
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,1F200,18), ECC Corrected: Yes, Syndrome: 0063, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,1CC00,8), ECC Corrected: Yes, Syndrome: 00DD, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,1A600,10), ECC Corrected: Yes, Syndrome: 00FF, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,18000,8), ECC Corrected: Yes, Syndrome: 00F9, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,15A00,8), ECC Corrected: Yes, Syndrome: 007F, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,13400,10), ECC Corrected: Yes, Syndrome: 009E, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,10E00,8), ECC Corrected: Yes, Syndrome: 00E5, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,10000,8), ECC Corrected: Yes, Syndrome: 003C, Channel/Slot: 1/0
[ECC Error] Test: 1, (Rank,Bank,Row,Col): (2,0,1BA00,0), ECC Corrected: Yes, Syndrome: 0050, Channel/Slot: 1/0
[ECC Error] Test: 0, (Rank,Bank,Row,Col): (2,0,10000,0), ECC Corrected: Yes, Syndrome: 00CE, Channel/Slot: 1/0
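As a sanity check on the two reports, the "Memory Range Tested" end addresses can be converted to MB. The sketch assumes the end address is a hexadecimal byte count (the report prints it without a `0x` prefix) and that "MB" means 1024 * 1024 bytes; under those assumptions the numbers match the figures in parentheses, and the difference between the two runs is exactly three 16 GB modules.

```python
MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes


def range_end_to_mib(hex_end: str) -> int:
    """Read the range end as a hexadecimal byte address and return whole MiB."""
    return int(hex_end, 16) // MIB


all_modules = range_end_to_mib("1075800000")   # end address from the all-modules run
single_module = range_end_to_mib("475800000")  # end address from the single-module run

print("all modules:  ", all_modules, "MB")
print("single module:", single_module, "MB")
print("difference:   ", all_modules - single_module, "MB")
```

Both results agree with the values printed in the reports, so the ranges cover the whole address space in each configuration rather than a partial region.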
Any ideas?