Few errors but always on same CPU

  • Few errors but always on same CPU

    Hi,

    After a few weird crashes on a virtual machine this week, I decided to run a quick Memtest to check whether everything was fine. It did pick up a few errors, but what seems weird is that they all came from tests running on CPU core #22 (I'm using a Ryzen 3900X, so 24 logical cores are available).

    I ran another test the following night and got the same thing: just one error, but on the same CPU core. What are the odds that all 4 errors would pop up on the same CPU core across 2 runs in a row? Would this instead indicate that the CPU is defective?

    Thanks for your input.

    Logs of the first run:

    System Information
    EFI Specifications 2.70
    System
    Manufacturer Gigabyte Technology Co., Ltd.
    Product Name X570 AORUS ELITE
    Version -CF
    Serial Number Default string
    BIOS
    Vendor American Megatrends Inc.
    Version F12f
    Release Date 03/06/2020
    Baseboard
    Manufacturer Gigabyte Technology Co., Ltd.
    Product Name X570 AORUS ELITE
    Version x.x
    Serial Number Default string
    CPU Type AMD Ryzen 9 3900X 12-Core
    CPU Clock 3793 MHz [Turbo: 3793.3 MHz]
    # Logical Processors 24
    L1 Cache 24 x 64K (210902 MB/s)
    L2 Cache 24 x 512K (78362 MB/s)
    L3 Cache 1 x 65536K (19016 MB/s)
    Memory 32733M (18115 MB/s)
    DIMM Slot #0 16GB DDR4 XMP PC4-28800
    G Skill Intl / F4-3600C16-16GVKC
    16-19-19-39 / 3602 MHz / 1.350V
    DIMM Slot #1 16GB DDR4 XMP PC4-28800
    G Skill Intl / F4-3600C16-16GVKC
    16-19-19-39 / 3602 MHz / 1.350V
    Result summary
    Test Start Time 2020-05-13 00:42:20
    Elapsed Time 8:17:39
    Memory Range Tested 0x0 - 81F300000 (33267MB)
    CPU Selection Mode Parallel (All CPUs)
    ECC Polling Enabled
    # Tests Passed 46/48 (95%)
    Lowest Error Address 0x6F5471D50 (28500MB)
    Highest Error Address 0x6F7471E70 (28532MB)
    Bits in Error Mask 0000000000004000
    Bits in Error 1
    Max Contiguous Errors 1
    Test # Tests Passed Errors
    Test 0 [Address test, walking ones, 1 CPU] 4/4 (100%) 0
    Test 1 [Address test, own address, 1 CPU] 4/4 (100%) 0
    Test 2 [Address test, own address] 4/4 (100%) 0
    Test 3 [Moving inversions, ones & zeroes] 4/4 (100%) 0
    Test 4 [Moving inversions, 8-bit pattern] 4/4 (100%) 0
    Test 5 [Moving inversions, random pattern] 3/4 (75%) 1
    Test 6 [Block move, 64-byte blocks] 3/4 (75%) 2
    Test 7 [Moving inversions, 32-bit pattern] 4/4 (100%) 0
    Test 8 [Random number sequence] 4/4 (100%) 0
    Test 9 [Modulo 20, ones & zeros] 4/4 (100%) 0
    Test 10 [Bit fade test, 2 patterns, 1 CPU] 4/4 (100%) 0
    Test 13 [Hammer test] 4/4 (100%) 0
    Last 10 Errors
    2020-05-13 07:01:05 - [Data Error] Test: 6, CPU: 22, Address: 6F7471E70, Expected: 00001000, Actual: 00005000
    2020-05-13 07:01:05 - [Data Error] Test: 6, CPU: 22, Address: 6F5471E90, Expected: 00001000, Actual: 00005000
    2020-05-13 04:46:24 - [Data Error] Test: 5, CPU: 22, Address: 6F5471D50, Expected: ED112FAD, Actual: ED116FAD
    Logs of the second run:

    System Information
    EFI Specifications 2.70
    System
    Manufacturer Gigabyte Technology Co., Ltd.
    Product Name X570 AORUS ELITE
    Version -CF
    Serial Number Default string
    BIOS
    Vendor American Megatrends Inc.
    Version F12f
    Release Date 03/06/2020
    Baseboard
    Manufacturer Gigabyte Technology Co., Ltd.
    Product Name X570 AORUS ELITE
    Version x.x
    Serial Number Default string
    CPU Type AMD Ryzen 9 3900X 12-Core
    CPU Clock 3793 MHz [Turbo: 3793.4 MHz]
    # Logical Processors 24
    L1 Cache 24 x 64K (205025 MB/s)
    L2 Cache 24 x 512K (79142 MB/s)
    L3 Cache 1 x 65536K (19265 MB/s)
    Memory 32733M (18376 MB/s)
    DIMM Slot #0 16GB DDR4 XMP PC4-28800
    G Skill Intl / F4-3600C16-16GVKC
    16-19-19-39 / 3602 MHz / 1.350V
    DIMM Slot #1 16GB DDR4 XMP PC4-28800
    G Skill Intl / F4-3600C16-16GVKC
    16-19-19-39 / 3602 MHz / 1.350V
    Result summary
    Test Start Time 2020-05-14 00:13:50
    Elapsed Time 8:11:39
    Memory Range Tested 0x0 - 81F300000 (33267MB)
    CPU Selection Mode Parallel (All CPUs)
    ECC Polling Enabled
    # Tests Passed 47/48 (97%)
    Lowest Error Address 0x6F5471D50 (28500MB)
    Highest Error Address 0x6F5471D50 (28500MB)
    Bits in Error Mask 0000000000004000
    Bits in Error 1
    Max Contiguous Errors 1
    Test # Tests Passed Errors
    Test 0 [Address test, walking ones, 1 CPU] 4/4 (100%) 0
    Test 1 [Address test, own address, 1 CPU] 4/4 (100%) 0
    Test 2 [Address test, own address] 4/4 (100%) 0
    Test 3 [Moving inversions, ones & zeroes] 4/4 (100%) 0
    Test 4 [Moving inversions, 8-bit pattern] 4/4 (100%) 0
    Test 5 [Moving inversions, random pattern] 4/4 (100%) 0
    Test 6 [Block move, 64-byte blocks] 4/4 (100%) 0
    Test 7 [Moving inversions, 32-bit pattern] 4/4 (100%) 0
    Test 8 [Random number sequence] 3/4 (75%) 1
    Test 9 [Modulo 20, ones & zeros] 4/4 (100%) 0
    Test 10 [Bit fade test, 2 patterns, 1 CPU] 4/4 (100%) 0
    Test 13 [Hammer test] 4/4 (100%) 0
    Last 10 Errors
    2020-05-14 06:42:55 - [Data Error] Test: 8, CPU: 22, Address: 6F5471D50, Expected: 663DBEF6, Actual: 663DFEF6
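
A quick way to read the error lines in both logs is to XOR each expected value with the actual value; the set bits in the result are the bits that flipped. The sketch below does this for the four logged errors. The name `flipped_bits` is made up for illustration, and the values are copied from the "Last 10 Errors" entries above.

```python
# (address, expected, actual) copied from the "Last 10 Errors" lines of both runs.
ERRORS = [
    (0x6F7471E70, 0x00001000, 0x00005000),  # run 1, test 6
    (0x6F5471E90, 0x00001000, 0x00005000),  # run 1, test 6
    (0x6F5471D50, 0xED112FAD, 0xED116FAD),  # run 1, test 5
    (0x6F5471D50, 0x663DBEF6, 0x663DFEF6),  # run 2, test 8
]


def flipped_bits(expected, actual):
    """Return the positions (0 = least significant) of bits that differ."""
    diff = expected ^ actual
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


for address, expected, actual in ERRORS:
    mask = expected ^ actual
    print(f"{address:#x}: mask={mask:#010x} bits={flipped_bits(expected, actual)}")
```

Every pair differs in bit 14 only, which agrees with the "Bits in Error Mask 0000000000004000" line in both result summaries.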

  • #2
    More likely it is bad RAM.

    In the current V8.4 release of MemTest86, each core tests memory in blocks of at most 64MB at a time.
    | Core 0 - 64MB | Core 1 - 64MB | ..... | Core N - 64 MB |

    So there is a fixed relationship between cores and addresses. If you have a single bad memory address, it is no surprise that it shows up on the same core each time.
    We have discussed this a bit internally, and we will likely switch to some pseudo-random allocation of cores in a future software release.
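
As a rough illustration of the block layout described here, the sketch below computes which 64MB block each logged error address would fall into. It assumes blocks are 64MB-aligned starting at address 0, which is a simplification made for this example and not a statement about how MemTest86 actually lays out its blocks. It also does not attempt to model which core a given block is assigned to. `block_index` is a hypothetical helper name.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64MB, the maximum block size mentioned in the reply

# Error addresses copied from the first run's log; the second run's single
# error is at 0x6F5471D50 again.
ERROR_ADDRESSES = [0x6F7471E70, 0x6F5471E90, 0x6F5471D50]


def block_index(address, block_size=BLOCK_SIZE):
    """Index of the block containing `address`, assuming blocks start at 0."""
    return address // block_size


blocks = {hex(address): block_index(address) for address in ERROR_ADDRESSES}
for address, index in blocks.items():
    print(f"{address} -> block {index}")
```

Under that assumption, all three addresses fall into the same block, so they would be tested by the same core under a fixed block-to-core mapping.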



    • #3
      Thanks for your input. I also just noticed that the address of the error in the second run matches the address of one of the errors in the first run.
