Unexpected CP Panic and module Reset

Hello,

I’m getting a CP panic on my EC200UCN-AA module, which runs the OpenCPU SDK. I have a custom application running alongside the GPS demo code from the SDK.
The module resets unexpectedly. To resolve this, I want to understand how we can debug the chip better.
I’m using the coolwatcher debug suite, which is what the vendor provided, and it only prints AP logs.
Need to understand:

  1. What is causing the crash
  2. How we can debug this better and get more information, such as stack information

@Victor.W This is critical and blocking, please assist here.

Please add more custom logs to track the code execution.

Not sure what you mean by ‘custom logs’. Just more AP log traces from coolwatcher?

You can add your own log in the application.
For example, in gnss demo, use:

QL_GNSSDEMO_LOG("Hello!");

Then you can track down where the code stopped.
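A minimal sketch of what such checkpoint logging could look like. The names APP_LOG, TRACE_POINT, g_trace_seq and parse_fix are made up for illustration and are not part of the SDK. APP_LOG falls back to printf so the sketch builds on a PC; in the application it could be mapped to QL_GNSSDEMO_LOG, assuming that macro accepts printf-style arguments.

```c
#include <stdio.h>

/* Stand-in for the SDK log macro so this sketch builds on a PC.
   Assumption: in the application this could be defined as QL_GNSSDEMO_LOG. */
#ifndef APP_LOG
#define APP_LOG(...) printf(__VA_ARGS__)
#endif

/* Hypothetical running counter: shows the order of checkpoints across tasks. */
unsigned long g_trace_seq = 0;

/* Logs a sequence number, the function name, the line number and a short tag. */
#define TRACE_POINT(tag)                                        \
    do {                                                        \
        g_trace_seq++;                                          \
        APP_LOG("[trace %lu] %s:%d %s\n",                       \
                g_trace_seq, __func__, __LINE__, (tag));        \
    } while (0)

/* Hypothetical application function instrumented with checkpoints. */
int parse_fix(const char *nmea)
{
    TRACE_POINT("enter");
    if (nmea == NULL) {
        TRACE_POINT("null input");
        return -1;
    }
    TRACE_POINT("exit");
    return 0;
}
```

The sequence number makes it easier to see which checkpoint was really the last one when several tasks log at the same time.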

The code has plenty of AP logs, the crash happens unexpectedly and is inconsistent in terms of:

  1. Total duration of code execution.
  2. What thread is running.
  3. Last AP Log also keeps changing.

Given that, I need to understand what more detailed ways there are to debug code on the chip, such as looking at the stack right before the blue screen event, decoding the dump message, and investigating based on the info in the dump message.

If you mean something like JTAG debugging, it’s not possible on the chip.
Adding custom logs is as accurate as it can get. You can filter for your logs in the coolwatcher application.

Also, you may try disabling each task one by one. This way you can narrow down which task is causing the crash.

Honestly, that seems very off. No toolchain, embedded or not, can work without some view into chip internals such as memory and the stack.
There are also threads printing memory dumps at the time of the crash, with info like the program counter and stack pointer values.
The coolhost suite itself has many features, such as blue screen and memory dumps, with no documentation on how to use any of them.
There is also the CP log port, which prints CP logs. I highly doubt that this printf-style AP log is the only way of debugging on this chip.

As you said, the thread and the last log differ from crash to crash, which means that information is not very reliable.
If you have the PC value, you can roughly locate the function by looking the address up in the .map files that are generated along with the firmware.
If you want to check out the CP logs, you need another tool called ArmTracer. You should be able to get this tool from your supplier.
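To illustrate the PC lookup mentioned above: a .map file lists symbols with their start addresses, so the function containing a given PC is the symbol with the largest start address that is not above the PC. Below is a rough sketch of that search. The type map_symbol_t, the function find_symbol, and the table entries (names and addresses) are all invented for illustration; real values would have to be copied from the .map file of your own build.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry as it might be copied from a .map file (hypothetical layout). */
typedef struct {
    uint32_t    addr;  /* start address of the symbol */
    const char *name;  /* symbol name */
} map_symbol_t;

/* Returns the symbol with the largest start address <= pc, or NULL if pc
   lies below the first symbol. The table must be sorted by ascending addr. */
const map_symbol_t *find_symbol(const map_symbol_t *table, size_t count,
                                uint32_t pc)
{
    const map_symbol_t *best = NULL;
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= pc) {
            best = &table[mid];  /* candidate; look for a closer one above */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}

/* Made-up example entries, not taken from any real firmware. */
const map_symbol_t demo_symbols[] = {
    { 0x60010000u, "app_init" },
    { 0x60010240u, "gnss_task" },
    { 0x60010800u, "uart_rx_handler" },
};
```

With this table, a PC of 0x60010300 resolves to gnss_task at offset 0xC0. The offset (pc minus the symbol address) gives a hint of how far into the function the crash happened.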

Ruling out suspect tasks one by one is the most direct way of finding the problem.
If the GNSS task seems suspicious, just disable it and see if the module crashes again.
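One way to make this task-by-task elimination repeatable is a set of build-time switches, so each test build differs by exactly one task. This is only a sketch: the switch names and the two start functions are placeholders, and the real application would call its own task-creation code there.

```c
#include <stdio.h>

/* Hypothetical build-time switches: set one to 0, rebuild, rerun the test. */
#define APP_ENABLE_GNSS_TASK   0
#define APP_ENABLE_SENSOR_TASK 1

/* Placeholders for the real task-creation calls in the application. */
int start_gnss_task(void)   { puts("gnss task started");   return 0; }
int start_sensor_task(void) { puts("sensor task started"); return 0; }

/* Starts only the enabled tasks and returns how many were started. */
int app_start_tasks(void)
{
    int started = 0;

#if APP_ENABLE_GNSS_TASK
    if (start_gnss_task() == 0) {
        started++;
    }
#endif
#if APP_ENABLE_SENSOR_TASK
    if (start_sensor_task() == 0) {
        started++;
    }
#endif
    return started;
}
```

If the crash disappears with one switch off and returns when it is turned back on, that task is the place to add more logs.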