StressTest App Error During DDR3 Stress Testing


  • StressTest App Error During DDR3 Stress Testing

    Dear Robert,

    Hi, this issue concerns a custom board based on the IMX6-Rex board.

    > After running DDR calibration (using V2.6) many times, I obtained fairly consistent MMDC register values.
    > I updated the .cfg file (800mhz_4x256mx16.cfg) with these values. This cfg supplies the RAM configuration to the u-boot (2016) we run.
    > With only these configuration changes, I compiled u-boot and successfully loaded it to SPI flash.
    > On the OS (Ubuntu Xenial, booting from an SD card), I ran StressTest for 30 minutes; the log is at the following link:
    https://pastebin.com/0DTWC9zK

    As you can see, 5 HW errors were reported. So my questions are:

    1. Can we decode these error messages and determine exactly where the issues were?
    2. Apart from the changes made in the .cfg file, are there any other settings I have to change?
    3. Since I am running Ubuntu from an SD card, could this be due to SD card defects?

    Thanks in Advance
    Anuradha
    Last edited by ranaya; 06-27-2017, 04:45 AM.

  • #2
    PS: How many times should a DDR calibration usually be performed? Until we get the same register values? In my case, even after 10 calibration runs the values still differed from the previous run.



    • #3
      1. If you run only the memory test, this should not happen. I am not sure whether you can really decode the message or how useful that would be; see point 3.
      2. The correct cfg file should fix it.
      3. From our own testing, I know that Ethernet errors may be reported as memory errors. This is from our communication with the author of the test:

      Error:
      stressapptest -s 600000 -M 256 -n 192.168.0.25 --printsec 600
      2015/05/20-05:32:45(UTC) Log: Resuming worker threads to cause a power spike (553185 seconds remaining)
      net eth0: FEC ENET: rcv is not +last
      2015/05/20-05:34:10(UTC) Report Error: miscompare : DIMM Unknown : 1 : 46901s
      2015/05/20-05:34:10(UTC) Hardware Error: miscompare on CPU 3(0xF) at 0x2f532ec8(0x85d23ec8:DIMM Unknown): read:0x0000000000000000, reread:0x0000000000000000 expected:0xffffffffffffffff


      Robert (myself):
      Thank you very much, Nick. I also suspect it is Ethernet; it may also be something with the laptop where the stressapptest Ethernet server was running. I was just confused because the error message says it is a DIMM error.

      Nick (author of the test):
      It looks like the network loopback code doesn't have any particular way to figure out what kind of error it is and just routes everything into the standard error handling (which assumes all errors are probably memory errors). So don't read too much into it. I'll file a feature request to make the reporting clearer.


      However, as I explained in point 1, if you run only the memory test, you should be able to run it with no errors. Have a look here: https://www.fedevel.com/welldoneblog...emory-testing/
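On question 1, the fields of a "Hardware Error: miscompare" line can at least be pulled apart mechanically. The following is only a minimal sketch in Python: the pattern is modeled on the single log line quoted in this thread, so other stressapptest versions may print something different, and the names `MISCOMPARE_RE` and `decode_miscompare` are made up for illustration. It XORs the read and expected words to list which bits differ; as noted above, that still does not tell you whether the cause was memory or something else.

```python
import re

# Pattern modeled on the one miscompare line quoted in this thread
# (assumption: other stressapptest versions may format it differently).
MISCOMPARE_RE = re.compile(
    r"miscompare on CPU (?P<cpu>\d+)\(0x[0-9A-Fa-f]+\) "
    r"at 0x(?P<vaddr>[0-9A-Fa-f]+)\(0x(?P<paddr>[0-9A-Fa-f]+)[^)]*\): "
    r"read:0x(?P<read>[0-9A-Fa-f]+), "
    r"reread:0x(?P<reread>[0-9A-Fa-f]+) "
    r"expected:0x(?P<expected>[0-9A-Fa-f]+)"
)


def decode_miscompare(line):
    """Return the fields of a miscompare line as a dict, or None if no match."""
    m = MISCOMPARE_RE.search(line)
    if m is None:
        return None
    read = int(m.group("read"), 16)
    reread = int(m.group("reread"), 16)
    expected = int(m.group("expected"), 16)
    diff = read ^ expected  # a 1 bit marks a position that differs
    return {
        "cpu": int(m.group("cpu")),
        "vaddr": int(m.group("vaddr"), 16),
        "paddr": int(m.group("paddr"), 16),
        "read": read,
        "reread": reread,
        "expected": expected,
        "flipped_bits": [bit for bit in range(64) if (diff >> bit) & 1],
        # True when the second read returned the same wrong value
        "same_on_reread": read == reread,
    }


sample = (
    "2015/05/20-05:34:10(UTC) Hardware Error: miscompare on CPU 3(0xF) "
    "at 0x2f532ec8(0x85d23ec8:DIMM Unknown): read:0x0000000000000000, "
    "reread:0x0000000000000000 expected:0xffffffffffffffff"
)
result = decode_miscompare(sample)
print(hex(result["paddr"]), len(result["flipped_bits"]), result["same_on_reread"])
```

In the quoted line every one of the 64 bits differs and the reread returned the same value, which looks more like a whole word that was never written correctly than a single marginal data line, consistent with the Ethernet explanation above.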

      PS: You never get the same register values; they change a little every time.

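Since the values move a little on every run, it can be more useful to look at how far each byte lane moves than to wait for identical numbers. Below is a rough sketch in Python under stated assumptions: it treats a delay register as four 7-bit fields at bits 6:0, 14:8, 22:16 and 30:24 (please check this layout against the reference manual for the register you are comparing), and the three register values are invented purely for illustration, not taken from any real calibration.

```python
def byte_lanes(reg_value):
    """Split a 32-bit value into four 7-bit fields (assumed layout:
    bits 6:0, 14:8, 22:16, 30:24, lowest byte lane first)."""
    return [(reg_value >> shift) & 0x7F for shift in (0, 8, 16, 24)]


def lane_spread(values):
    """For each byte lane, return max - min across all calibration runs."""
    lanes = [byte_lanes(v) for v in values]
    return [max(column) - min(column) for column in zip(*lanes)]


# Hypothetical values of one delay register from three calibration runs.
runs = [0x40404040, 0x423E4240, 0x3E40403E]
print(lane_spread(runs))  # → [2, 2, 2, 4]
```

A small spread on every lane is the "changing a little every time" case; one lane that swings much more than the others would be the one to look at first.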


      • #4
        Hi, thanks a lot for your input, Robert. I was able to make some progress after using the DDR IO drive strength (DS) values from your cfg file. Running the app for 1-4 hours no longer reports any issues. However, it reported a few (5) incidents during 24-hour testing. I am digging further into this to see how it goes.

        One thing I still need to sort out: how did you arrive at those DS values? I know the initial Excel sheet states different values.



        • #5
          I arrived at them by checking the memory settings of other IMX6 boards.
