[autobuilder][PATCH v3 0/5] generate regression reports against proper releases


Alexis Lothoré <alexis.lothore@...>
 

Hi, this is the 3rd version of the work initiated to improve the usefulness
of regression reports, started around issue YOCTO #14065
(https://bugzilla.yoctoproject.org/show_bug.cgi?id=14065).

Changes since v2:
- add symlink to preserve compatibility with current autobuilder configuration

Changes since v1:
- minor rework to be able to import send_qa_email.py as a standard Python
  module
- properly manage non-release versions, since send_qa_email.py can be called
  with such "release" versions ("-r" parameter)
- add unit tests on previous version computation
- do not fetch the full yocto-testresults history: identify needed revisions
  with git ls-remote and retrieve them with git fetch
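
As an illustration of the last point, here is a minimal sketch (not the code from the patches) of how the needed tags could be picked out of `git ls-remote` output and turned into a targeted `git fetch` command. The helper names, the remote name and the sample tag names are assumptions made up for this example.

```python
def parse_ls_remote(output):
    """Map each ref name to its object id, given `git ls-remote` text output."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<object id><TAB><ref name>".
        sha, ref = line.split(maxsplit=1)
        refs[ref.strip()] = sha
    return refs


def build_fetch_command(remote, refs, wanted_tags):
    """Build a `git fetch` command restricted to the wanted tags.

    Hypothetical helper: tags missing on the remote are silently skipped.
    """
    refspecs = []
    for tag in wanted_tags:
        ref = f"refs/tags/{tag}"
        if ref in refs:
            refspecs.append(f"{ref}:{ref}")
    return ["git", "fetch", remote] + refspecs


# Made-up sample standing in for the output of `git ls-remote --tags <remote>`.
SAMPLE_OUTPUT = (
    "1111111111111111111111111111111111111111\trefs/tags/example-a\n"
    "2222222222222222222222222222222222222222\trefs/tags/example-b\n"
)

refs = parse_ls_remote(SAMPLE_OUTPUT)
command = build_fetch_command("origin", refs, ["example-b", "missing-tag"])
```

The resulting list could then be passed to `subprocess.run(command, check=True)` inside the local clone, so that only the listed tags are transferred instead of the whole history.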

Alexis Lothoré (5):
scripts/send_qa_email.py: Rename send-qa-email to send_qa_email.py
scripts/send_qa_email.py: Wrap send_qa_email.py content in function
scripts/send-qa-email: Generate regression reports against most
relevant release
scripts/send_qa_email.py: add unit tests on previous version
computation
scripts/send-qa-email: add symlink to preserve compatibility with
autobuilder

 scripts/send-qa-email         | 165 +------------------------
 scripts/send_qa_email.py      | 223 ++++++++++++++++++++++++++++++++++
 scripts/shared-repo-unpack    |   2 +-
 scripts/test_send_qa_email.py |  57 +++++++++
 scripts/utils.py              |  47 +++++++
 5 files changed, 329 insertions(+), 165 deletions(-)
 mode change 100755 => 120000 scripts/send-qa-email
 create mode 100755 scripts/send_qa_email.py
 create mode 100755 scripts/test_send_qa_email.py

--
2.39.0


Richard Purdie
 

Thanks. Since I like living dangerously, I've put these on the master
branch for the M2 release build. I wanted to see how this does in a
real world test.

I did notice one issue which we'll need to fix in some follow-up
patches; I'll reply to the patch about that.

Cheers,

Richard


Alexis Lothoré <alexis.lothore@...>
 

On 1/26/23 23:42, Richard Purdie wrote:
> On Tue, 2023-01-24 at 18:30 +0100, Alexis Lothoré via
> lists.yoctoproject.org wrote:
> Thanks. Since I like living dangerously, I've put these on the master
> branch for the M2 release build. I wanted to see how this does in a
> real world test.

Thanks. Please feel free to get back to me if you observe any unwanted side
effects, since most of my knowledge comes from reading the scripts and the
autobuilder build history. I have tried to test it as much as possible on my
machine, but as I was discussing with Alexandre Belloni, the many stubs and
fake data used make the tests less accurate than the real-world case.

> I did notice one issue which we'll need to fix in some follow-up
> patches; I'll reply to the patch about that.

ACK, I am working on it.

--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Richard Purdie
 

On Fri, 2023-01-27 at 09:26 +0100, Alexis Lothoré wrote:
> Thanks. Please feel free to get back to me if you observe any unwanted side
> effects, since most of my knowledge comes from reading the scripts and the
> autobuilder build history. I have tried to test it as much as possible on my
> machine, but as I was discussing with Alexandre Belloni, the many stubs and
> fake data used make the tests less accurate than the real-world case.

The result of the M2 build is here, specifically the regression report:

https://autobuilder.yocto.io/pub/releases/yocto-4.2_M2.rc2/testresults/testresult-regressions-report.txt

(warning, it is a 13MB file, which may hint at some problems)

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/4844/steps/29/logs/stdio

was the log from running the script. It also suggests there may be a
few issues.

I think we need some improvements to the report. Firstly it should
state at the top which revisions are being compared and any tags they
correspond to. 
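
To make that request concrete, a report header along these lines could be generated. This is only a sketch of the idea: the function name, the layout and the sample revision and tag values are hypothetical, and none of it comes from resulttool or send_qa_email.py.

```python
def format_report_header(base_rev, target_rev, base_tags=(), target_tags=()):
    """Return a short text header naming the two revisions being compared."""

    def describe(label, revision, tags):
        # Show "none" explicitly so a missing tag is visible in the report.
        tag_text = ", ".join(tags) if tags else "none"
        return f"  {label}: {revision} (tags: {tag_text})"

    lines = [
        "Comparing revisions:",
        describe("base", base_rev, base_tags),
        describe("target", target_rev, target_tags),
    ]
    return "\n".join(lines)


# Made-up revisions and tag names, for illustration only.
header = format_report_header("aaaa", "bbbb", base_tags=["example-1"])
```

Printing such a header before the regression details would let a reader see at a glance what was compared against what.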

Next, we need to understand why the large chunk of tests "disappeared".
It may be related to the "Failed to retrieved" messages?

I also happen to know the reproducibility tests failed in this build,
yet the regression report has no mention of that.

So the patches are a start but there is much more to do.

Cheers,

Richard


Alexis Lothoré <alexis.lothore@...>
 

On 1/27/23 10:51, Richard Purdie wrote:
> The result of the M2 build is here, specifically the regression report:
>
> https://autobuilder.yocto.io/pub/releases/yocto-4.2_M2.rc2/testresults/testresult-regressions-report.txt
>
> (warning, it is a 13MB file, which may hint at some problems)
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/4844/steps/29/logs/stdio
>
> was the log from running the script. It also suggests there may be a
> few issues.
>
> I think we need some improvements to the report. Firstly it should
> state at the top which revisions are being compared and any tags they
> correspond to.

Yes, I was thinking about this while tweaking send-qa-email. Another point I
need to understand is how the regression report is handled when resulttool
finds multiple tags in the results directory for the same Yocto revision. I
have the feeling that the result differs depending on whether it finds one tag
or several tags for the same revision. Does it really make sense to check for
regressions against multiple tags referencing the same Yocto revision? If so,
we should indeed make the tag name appear. If not, how do we decide which tag
is the appropriate one? This question is more about resulttool internals than
about send-qa-email.
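
To make the "multiple tags for one revision" case easier to spot, a small check such as the following could be used. It is a sketch under assumptions: the tag-to-revision mapping is presumed to be already available (for example extracted from the test results repository), and the function names and sample values are invented for this example.

```python
from collections import defaultdict


def tags_by_revision(tag_to_revision):
    """Group tag names by the revision they reference."""
    grouped = defaultdict(list)
    # Sort by tag name so each group comes out in a stable order.
    for tag, revision in sorted(tag_to_revision.items()):
        grouped[revision].append(tag)
    return dict(grouped)


def ambiguous_revisions(tag_to_revision):
    """Keep only the revisions that are referenced by more than one tag."""
    return {
        revision: tags
        for revision, tags in tags_by_revision(tag_to_revision).items()
        if len(tags) > 1
    }


# Made-up mapping of result tags to revisions, for illustration only.
sample = {"tag-a": "rev1", "tag-b": "rev1", "tag-c": "rev2"}
ambiguous = ambiguous_revisions(sample)
```

Listing these revisions would at least show how often the situation occurs before deciding how the report should treat it.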

> Next, we need to understand why the large chunk of tests "disappeared".
> It may be related to the "Failed to retrieved" messages?
>
> I also happen to know the reproducibility tests failed in this build,
> yet the regression report has no mention of that.

ACK, I will take a look.

> So the patches are a start but there is much more to do.

Regards,

--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com