Eclipse GDB setup issues; can't determine cause for SIGSEGV


Bryan Evenson
 

All,

I have an AT91SAM9G25 system that has been idle for a couple years (running morty, Yocto version 2.2.1) and I am working on updating to the latest Yocto production branch. Before I get there, I'm trying to confirm the old setup and I'm having problems with remote debugging. I know there have been changes since Yocto version 2.7 for debugging support, so I want to make sure I can get the old setup to work first prior to changing everything. I'm looking for assistance in tracking down my debug issues.

My stable production image is based off of core-image-minimal, with a few additional packages for our proprietary applications (proprietary applications are written in C). I also have a development image, based off of our production image, with the following additions:

IMAGE_FEATURES += "package-management dev-pkgs eclipse-debug allow-empty-password empty-root-password"

IMAGE_INSTALL += " \
#same additional packages as production image \
#"-dbg" version of proprietary applications \
gdbserver \
"
# Strip python from the image to reduce the image size
PACKAGE_EXCLUDE = "python"

I have the Eclipse Yocto plugin installed and it is set up to use the SDK that I have built based on the development image. I've confirmed that I can start a debug session on one of our proprietary applications. I can set breakpoints and run the debugger. However, the debugger always stops at the first call to uuid_compare with a SIGSEGV. The last line in the call stack states "<symbol is not available> 0x00000000". From my understanding, the stack pointer is getting set to NULL when uuid_compare is getting called. If I stop the debugger and just run the application on the hardware, the application runs without errors. I have confirmed with syslog messages that I do not have the same NULL stack pointer issue when I run the application outside of the debugger.

Any suggestions on where to start looking? I don't see any obvious possible causes and I don't know where to start looking for the problem.

Thanks,
Bryan


Bryan Evenson
 

All,

-----Original Message-----
From: yocto@... <yocto@...> On
Behalf Of Bryan Evenson via lists.yoctoproject.org
Sent: Monday, June 1, 2020 11:16 AM
To: yocto@...
Subject: [yocto] Eclipse GDB setup issues; can't determine cause for SIGSEGV

[...]

On a whim, I changed in my code:
if(uuid_compare(uuid1, uuid2) == 0)

To:
if(memcmp(uuid1, uuid2, 16) == 0)

After this change the problematic line of code worked just fine. The debugger worked fine until I got to the next spot in my code that called uuid_compare, where I got the same SIGSEGV error I had before. Something is clearly wrong with calling uuid_compare. However, I'm using several other functions from the uuid library (uuid_is_null, uuid_parse and uuid_unparse, to name a few) and none of them are causing problems. I don't think it's a problem with the input parameters because I'm passing the same UUIDs to memcmp as I did to uuid_compare. Has anyone ever seen only one function from a library cause problems like this?
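
For reference, the substitution above works as an equality check because libuuid's uuid_t is a 16-byte array, so two UUIDs are equal exactly when all 16 bytes match. A minimal sketch of that check follows. The example_uuid_t type and the example_uuid_equal helper are hypothetical names used so the snippet builds without the library; they are not part of libuuid.

```c
#include <string.h>

/* Hypothetical stand-in for libuuid's uuid_t, which is a 16-byte array. */
typedef unsigned char example_uuid_t[16];

/* Returns 1 when both UUIDs hold the same 16 bytes, 0 otherwise. */
int example_uuid_equal(const example_uuid_t a, const example_uuid_t b)
{
    return memcmp(a, b, sizeof(example_uuid_t)) == 0;
}
```

This only stands in for the "== 0" test. It works around the crash but does not explain it, and it makes no attempt to match the ordering that uuid_compare reports for unequal UUIDs.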

Thanks,
Bryan


Bryan Evenson
 

All,

-----Original Message-----
From: yocto@... <yocto@...> On
Behalf Of Bryan Evenson via lists.yoctoproject.org
Sent: Monday, June 1, 2020 4:46 PM
To: Bryan Evenson <bevenson@...>;
yocto@...
Subject: Re: [yocto] Eclipse GDB setup issues; can't determine cause for
SIGSEGV

[...]

I don't know if this is related, but I noticed that the debugger cannot step into any library functions. For example, I tried stepping into strncpy, and the debugger stepped over the function call. I switched to assembly instruction stepping mode, then when I stepped into strncpy I received the message "No source available for strncpy@plt". I continued single stepping and was able to step through all the assembly code for strncpy. I then instruction single stepped into uuid_compare, and I was able to single step without issue until I got to the following instruction:
bx r12
When this instruction is executed, r12 is 0. I'm a little confused why this library function call is failing in this manner (when all the input parameters to uuid_compare are valid), and I'm not sure why I can't source step through any library functions. If anyone has any suggestions on either issue, please let me know.
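
One way to picture the failing instruction: bx r12 branches to whatever address r12 holds, so with r12 equal to 0 the program jumps to address 0, which would be consistent with the 0x00000000 frame in the call stack. In C terms this resembles calling through a null function pointer. The sketch below only illustrates that failure mode. All of its names (example_compare_fn, example_checked_call, example_compare16) are hypothetical, and it does not show where the zero in r12 comes from.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature with the same shape as a UUID compare call. */
typedef int (*example_compare_fn)(const unsigned char *a,
                                  const unsigned char *b);

/* Hypothetical compare routine, used only to exercise the wrapper. */
int example_compare16(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, 16);
}

/*
 * Calls fn(a, b) only when fn is non-NULL.
 * On success stores the comparison result in *result and returns 0.
 * Returns -1 without calling anything when fn is NULL.
 */
int example_checked_call(example_compare_fn fn,
                         const unsigned char *a,
                         const unsigned char *b,
                         int *result)
{
    if (fn == NULL) {
        /* An unchecked call here would branch to address 0. */
        return -1;
    }
    *result = fn(a, b);
    return 0;
}
```

Passing NULL as fn makes example_checked_call return -1 instead of faulting, which is the check that an indirect branch such as bx r12 never gets.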

Thanks,
Bryan


Khem Raj
 

On Tue, Jun 2, 2020 at 6:52 AM Bryan Evenson <bevenson@...> wrote:

[...]

It seems you are using the Thumb-1 ISA. Usually there is some sort of
veneer code to switch from ARM mode to Thumb mode, which might employ
this kind of indirect jump. In the e2fsprogs recipe, you can
temporarily add INHIBIT_PACKAGE_STRIP = "1" and load the image;
perhaps that will give you a better debugging experience. That said,
it looks like there is a code-gen bug.


Bryan Evenson
 

Khem,

-----Original Message-----
From: Khem Raj <raj.khem@...>
Sent: Tuesday, June 2, 2020 2:20 PM
To: Bryan Evenson <bevenson@...>
Cc: yocto@...
Subject: Re: [yocto] Eclipse GDB setup issues; can't determine cause for
SIGSEGV

[...]

Thanks for the suggestions. I tried them and they didn't help. I replaced all the uuid_compare calls in my code and I was able to do some debugging. But then I started having other odd problems with SIGABRT and SIGSEGV errors at points that normally work.

I then merged in some code from another developer who is using an older SDK. I wasn't able to build any more and determined that the makefiles and configure script needed to be updated (it is an autotools-based program). A simple autoreconf didn't fix it; I had to invoke aclocal, autoconf, automake and then autoreconf. After all these changes, I can now debug my application without it crashing. The uuid_compare calls are working just fine and I've been able to run the debugger and set breakpoints in many parts of my code. I don't fully understand why that was the issue, but it appears I needed to do a more in-depth autotools reconfiguration for debugging to work with my copy of the SDK.

Thanks,
Bryan


Khem Raj
 

On Wed, Jun 3, 2020 at 9:41 AM Bryan Evenson <bevenson@...> wrote:

[...]

What code did you merge, and was it into the layers or into your app?
autoreconf helping would suggest that it was some pre-generated header
or similar. I am not sure what was causing it, and not sure what
fixed it.
