BurninTest 5.3 build 1036 serial receive overrun

  • BurninTest 5.3 build 1036 serial receive overrun

    Hello,

    We were using PassMark BurnInTest Pro V5.3 build 1026 and recently upgraded to build 1036. For months we have been plagued by receive overrun errors on the serial ports of some of our newer product lines during 12-hour stress runs. Upgrading to build 1036 makes the failures occur sooner. While validating with the Portmon utility from Sysinternals, I found that for some reason there are three successive 100-byte writes, followed by three 100-byte reads. The serial ports are provided by a Winbond (now Nuvoton) LPC Super I/O controller that has 16-byte FIFOs.

    With the older version, the testers had been rebooting Windows, then running the test again.

    We are seeing failures more often on Windows 2000 SP4. The details of one of these test units, as shown in the BurnInTest report, follow (build 1036 fails to report the hard drives):
    Operating system: Windows 2000 Professional Service Pack 4 build 2195
    Number of CPUs: 1 (2 Core(s)/CPU, 1 Logical(s)/Core)
    CPU manufacturer: GenuineIntel
    CPU type: Intel(R) Core(TM)2 CPU T5300 @ 1.73GHz
    CPU features: MMX SSE SSE2 SSE3 DEP PAE
    CPU1 speed: 1728.6 MHz
    CPU L2 Cache: 2 MB
    RAM: 1014 MB
    Video card: Mobile Intel(R) 945 Express Chipset Family (Resolution: 800x600x32)

    Example tests for a 12-hour duration follow:
    Test Start time: Wed Jul 15 16:10:04 2009
    Test Stop time: Thu Jul 16 04:10:08 2009
    Test Duration: 012h 00m 04s

    Test Name        Cycles  Operations      Result  Errors  Last Error
    CPU - Maths      18231   1.995 Trillion  PASS    0       No errors
    CPU - SIMD       3723    2.530 Trillion  PASS    0       No errors
    Memory (RAM)     80      120 Billion     PASS    0       No errors
    Disk (C: )       40      116 Billion     PASS    0       No errors
    Disk (E: )       310     115 Billion     PASS    0       No errors
    CD/DVD (D: )     98      143 Billion     PASS    0       No errors
    Network 1        285     2.284 Million   PASS    0       No errors
    Parallel Port    184     55.252 Million  PASS    0       No errors
    Video Playback   7160    31987           PASS    0       No errors
    Serial Port 1    996     57.414 Million  PASS    0       No errors
    Serial Port 2    996     57.421 Million  PASS    0       No errors

  • #2
    This issue is still with us on five different systems. In analyzing the serial data stream (Portmon capture below), I see that two bytes are lost in the middle of the third 100-byte block: the third read times out with only 98 bytes, and the bytes 00 2F that follow 45 6B in the written data are missing.

    Again, this issue appears to be magnified by the latest update to BurnInTest.

    1857477 11:58:53 PM bit.exe IRP_MJ_WRITE Serial1 SUCCESS Length 100: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 00 2F 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
    1857479 11:58:53 PM bit.exe IRP_MJ_WRITE Serial1 SUCCESS Length 100: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 00 2F 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
    1857481 11:58:53 PM bit.exe IRP_MJ_WRITE Serial1 SUCCESS Length 100: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 00 2F 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
    1857494 11:58:53 PM bit.exe IRP_MJ_READ Serial1 SUCCESS Length 100: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 00 2F 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
    1857495 11:58:53 PM bit.exe IRP_MJ_READ Serial1 SUCCESS Length 100: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 00 2F 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
    1857496 11:58:53 PM bit.exe IRP_MJ_READ Serial1 TIMEOUT Length 98: 0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B 8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F 40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64
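To make the loss easier to spot than by eye, the written and received buffers can be diffed programmatically. The following is only a sketch: the hex data is transcribed from the Portmon lines above, and the helper name `find_lost_bytes` is hypothetical, not part of Portmon or BurnInTest.

```python
from difflib import SequenceMatcher

# Bytes transcribed from the Portmon capture above.
# HEAD and TAIL are common to the write and the short third read.
HEAD = (
    "0E 3A 05 6A 7D 91 6A DB 25 7F EB 98 C2 E7 34 E9 DE C9 CD CB "
    "E7 62 94 AF 44 9E 33 0E E0 6B F4 02 37 49 6E DC 59 1C 45 6B "
)
TAIL = (
    "8D 56 27 D9 95 65 2C BC 15 7A 8E 0E 01 A0 35 61 19 F6 "
    "D8 E8 61 D7 82 19 17 07 B6 AF DB 3A B5 83 A8 60 75 A1 B7 5F "
    "40 94 04 36 05 EA 86 4D 5A 4A 4C C2 47 A7 9D 10 37 78 B4 64"
)

WRITTEN = bytes.fromhex(HEAD + "00 2F " + TAIL)   # IRP_MJ_WRITE, Length 100
RECEIVED = bytes.fromhex(HEAD + TAIL)             # IRP_MJ_READ, TIMEOUT, Length 98


def find_lost_bytes(written: bytes, received: bytes):
    """Return (offset, missing_bytes) for each run that is present in
    `written` but absent from `received`."""
    matcher = SequenceMatcher(None, written, received, autojunk=False)
    lost = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "delete":  # bytes sent but never read back
            lost.append((i1, written[i1:i2]))
    return lost


if __name__ == "__main__":
    for offset, missing in find_lost_bytes(WRITTEN, RECEIVED):
        print(f"offset {offset}: lost {missing.hex(' ').upper()}")
```

For the capture above this reports a single gap at offset 40 containing 00 2F. The same helper could be run over every write/read pair in a longer capture to see whether the loss always lands at a similar offset.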

    • #3
      I assume you mean that you received the error “COM port detected a Receive Overrun Error”?

      This warning is related to the load on the COM port and the CPU's ability to service the COM port interrupts (i.e. CPU load and interrupt priorities). The COM port typically has two 16-byte buffers (on a UART/Super I/O chip), one for transmit and one for receive. The "COM port detected a Receive Overrun Error" occurs if data arrives at the UART while the receive FIFO is full, that is, before the CPU has serviced a COM port interrupt and copied the receive buffer to RAM. In this case the last byte in the UART receive FIFO is overwritten and that data is lost. The COM port interrupt trigger level is a Windows setting that typically defaults to 14, i.e. a CPU interrupt is generated when 14 of the 16 bytes in the UART buffer are full.
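As a rough illustration of why the trigger level matters, the time the CPU has to service the interrupt before the FIFO fills can be estimated from the baud rate. This is a sketch under stated assumptions: the baud rate used in the test is not given in this thread, so 115200 is assumed, along with 10 bits per character (8N1 framing). The function names are hypothetical. Many UARTs also hold one more character in the receive shift register, so the real margin may be about one character time longer.

```python
def char_time_us(baud: int, bits_per_char: int = 10) -> float:
    """Time to receive one character, in microseconds.
    10 bits per character assumes 8N1 framing (start + 8 data + stop)."""
    return bits_per_char * 1_000_000 / baud


def overrun_budget_us(baud: int, trigger_level: int,
                      fifo_depth: int = 16, bits_per_char: int = 10) -> float:
    """Conservative estimate of how long the interrupt can go unserviced
    after the trigger level is reached before the receive FIFO is full."""
    if not 0 < trigger_level <= fifo_depth:
        raise ValueError("trigger_level must be between 1 and fifo_depth")
    free_slots = fifo_depth - trigger_level
    return free_slots * char_time_us(baud, bits_per_char)


if __name__ == "__main__":
    BAUD = 115200  # assumed; substitute the rate configured in the test
    for trigger in (14, 8):
        budget = overrun_budget_us(BAUD, trigger)
        print(f"trigger {trigger:2d}: about {budget:.1f} us to service the interrupt")
```

Under these assumptions, lowering the receive trigger from 14 to 8 widens the window from roughly 174 us to roughly 694 us, at the cost of more frequent interrupts.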

      You could try changing:
      (1) the COM port speed (reduce)
      (2) the COM port duty cycle (reduce)
      (3) the UART settings (please see: http://support.Microsoft.com/kb/131016/EN-US/ ). For example, reduce both buffers to 8 (ensure that the transmit FIFO is not larger than the receive FIFO).
      (4) If you are running the CD test at the same time, check that the CD/DVD drive is on a controller that is set to DMA rather than PIO (PIO leads to a large number of CPU interrupts, which will impact the servicing of the COM port interrupt). I understand that the controller can automatically drop from DMA to PIO mode based on the number of errors (such as those caused by a badly scratched CD/DVD).

      Regards,
      Ian

      • #4
        The optical drive is in DMA mode, and setting the serial driver to use 8 bytes for the high-water mark made no difference. I had to give the system up so that our test engineer can look into this in detail.

        We have found that it does appear to be a CPU starvation issue that may be tied to a mini-PCI RAID adapter.

        • #5
          Thanks for the feedback. It is appreciated.

          Regards,
          Ian
