Re: Eclipse GDB setup issues; can't determine cause for SIGSEGV

I have an AT91SAM9G25 system that has been idle for a couple of years
(morty, Yocto version 2.2.1), and I am working on updating it to the
production branch. Before I get there, I'm trying to confirm the
old setup, and I'm having problems with remote debugging. I know
debugging support has changed since Yocto version 2.7, so I want to
make sure I can get the old setup to work first, before changing
everything. I'm looking for assistance in tracking down my debug issues.

My stable production image is based on core-image-minimal,
with a few additional packages for our proprietary applications
(which are written in C). I also have a development image, based on
our production image, with the following additions:

IMAGE_FEATURES += "package-management dev-pkgs eclipse-debug
empty-password empty-root-password"

#same additional packages as production image \
#"-dbg" version of proprietary applications \
gdbserver \
# Strip python from the image to reduce the image size

I have the Eclipse Yocto plugin installed, and it is set up to use
the SDK that I built from the development image. I've
confirmed that I can start a debug session on one of our
proprietary applications. I can set breakpoints and run the
debugger. However, the debugger always stops at the first call to
uuid_compare with a SIGSEGV. The last line in the call stack
states "<symbol is not available> 0x00000000". From my
understanding, the stack pointer is getting set to NULL when
uuid_compare is called. If I stop the debugger and just run the
application on the hardware, the application runs without errors.
I have confirmed with syslog messages that I do not have the same
NULL stack pointer issue when I run the application outside the
debugger.

Any suggestions on where to start looking? I don't see any obvious
causes, and I don't know where to begin tracking down the problem.

On a whim, I changed this line in my code:
if(uuid_compare(uuid1, uuid2) == 0)
to:
if(memcmp(uuid1, uuid2, 16) == 0)

After this change, the problematic line of code worked just fine.
The debugger worked until I got to the next spot in my code
that called uuid_compare, where I got the same SIGSEGV error as
before. Something is clearly wrong with calling uuid_compare.
However, I'm using several other functions from the uuid library
(uuid_is_null, uuid_parse, and uuid_unparse, to name a few), and none
of them cause problems. I don't think it's a problem with the input
parameters, because I'm passing the same UUIDs to memcmp as I did to
uuid_compare. Has anyone ever seen only one function from a library
cause problems like this?
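
The memcmp workaround described above can be wrapped in a small helper so
every call site changes the same way. This is only a sketch: the names
uuid_bytes_equal and UUID_BYTES are hypothetical and not part of libuuid.
It assumes each UUID is stored as 16 raw bytes, which is how libuuid's
uuid_t is laid out. It tests equality only and does not reproduce the
ordering result that uuid_compare returns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant: a UUID occupies 16 raw bytes. */
#define UUID_BYTES 16

/*
 * Hypothetical helper (not part of libuuid): returns 1 when the two
 * 16-byte buffers hold identical bytes, 0 otherwise.
 */
int uuid_bytes_equal(const unsigned char *a, const unsigned char *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, UUID_BYTES) == 0;
}
```

A call site would then read if(uuid_bytes_equal(uuid1, uuid2)) in place of
if(uuid_compare(uuid1, uuid2) == 0).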

I don't know if this is related, but I noticed that the debugger cannot step
into any library functions. For example, I tried stepping into strncpy, and the
debugger stepped over the function call. I switched to assembly instruction
stepping mode, and when I stepped into strncpy I received the message
"No source available for strncpy@plt". I continued single stepping and was
able to step through all the assembly code for strncpy. I then
single stepped by instruction into uuid_compare, and I was able to step without
issue until I got to the following instruction:
bx r12
When this instruction executes, r12 is 0. I'm a little confused about why this
library function call fails in this manner (when all the input parameters to
uuid_compare are valid), and I'm not sure why I can't source step through
any library functions. If anyone has any suggestions on either issue, please
let me know.
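
In C terms, a bx r12 with r12 equal to 0 behaves like a call through a
function pointer that holds NULL: control transfers to address 0, which is
consistent with the 0x00000000 frame in the call stack. The sketch below
only illustrates that idea with a guarded indirect call. The names
uuid_cmp_fn, bytewise_cmp and call_uuid_cmp are hypothetical, and this is
not a claim about how the PLT or any veneer code is actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a comparison routine with the same shape as uuid_compare. */
typedef int (*uuid_cmp_fn)(const unsigned char *a, const unsigned char *b);

/* Hypothetical stand-in comparison: bytewise compare of two 16-byte buffers. */
int bytewise_cmp(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, 16);
}

/*
 * Hypothetical guard: calls fn(a, b) and stores the result in *out.
 * Returns 0 on success, or -1 without calling anything when fn is NULL,
 * the case that would otherwise branch to address 0.
 */
int call_uuid_cmp(uuid_cmp_fn fn, const unsigned char *a,
                  const unsigned char *b, int *out)
{
    assert(out != NULL);
    if (fn == NULL) {
        return -1;
    }
    *out = fn(a, b);
    return 0;
}
```

Calling call_uuid_cmp(NULL, uuid1, uuid2, &result) returns -1 instead of
faulting, whereas an unguarded call through the same NULL pointer would
jump to address 0.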

It seems you are using the Thumb-1 ISA, and there is usually some sort of
veneer code to switch from ARM mode to Thumb mode, which might employ this
kind of indirect jump. In the e2fsprogs recipe, perhaps, you can temporarily
add INHIBIT_PACKAGE_STRIP = "1" and load the image; that may give you a
better debugging experience. Still, this looks like a code-gen bug.

Thanks for the suggestions. I tried them, but they didn't help. I replaced all the uuid_compare calls in my code and was able to do some debugging. But then I started having other odd problems, with SIGABRT and SIGSEGV errors at points that normally work.

I then merged in some code from another developer who is using an older SDK. I was no longer able to build, and I determined that the makefiles and configure script needed to be updated (it is an autotools-based program). A simple autoreconf didn't fix it; I had to invoke aclocal, autoconf, automake and then autoreconf. After all these changes, I can now debug my application without it crashing. The uuid_compare calls are working just fine, and I've been able to run the debugger and set breakpoints in many parts of my code. I don't fully understand why this was the issue, but it appears I needed to do a more in-depth autotools reconfiguration for debugging to work with my copy of the SDK.




