[PATCH yocto-autobuilder-helper] config.json: set oe-time-dd-test.sh timeout to 3 seconds
For the month of January 2023, the distribution of dd times has a long
tail that extends to 13 seconds, with 2 events exceeding the current
limit of 30 seconds.

Reduce the timeout to 3 seconds based on the observed distribution of
dd times, which would result in the timeout triggering about 20 times a
month. That's enough data to be useful but not so much that it
overwhelms the logging or the people who will analyze it. It also
avoids the rapid increase in the tail of the distribution, which starts
to rise exponentially below 2 seconds. Three seconds is also a sensible
response time for people to expect the system to have.

Signed-off-by: Randy MacLeod <Randy.MacLeod@...>
---
 config.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.json b/config.json
index 446528a..e50ec44 100644
--- a/config.json
+++ b/config.json
@@ -68,7 +68,7 @@
 "RUNQEMU_TMPFS_DIR = '/home/pokybuild/tmp'",
 "BB_HEARTBEAT_EVENT = '60'",
 "BB_LOG_HOST_STAT_ON_INTERVAL = '1'",
-"BB_LOG_HOST_STAT_CMDS_INTERVAL = 'oe-time-dd-test.sh -c 100 -t 15'",
+"BB_LOG_HOST_STAT_CMDS_INTERVAL = 'oe-time-dd-test.sh -c 100 -t 3'",
 "BB_LOG_HOST_STAT_ON_FAILURE = '1'",
 "BB_LOG_HOST_STAT_CMDS_FAILURE = 'oe-time-dd-test.sh -l'",
 "SDK_TOOLCHAIN_LANGS += 'rust'",
--
2.34.1
On 2023-02-02 17:00, Randy MacLeod via lists.yoctoproject.org wrote:
> For the month of January 2023, the distribution of dd times has a long
> tail that extends to 13 seconds, with 2 events exceeding the current
> limit of 30 seconds. Reduce the timeout to 3 seconds based on the
> observed distribution of dd times, which would result in the timeout
> triggering about 20 times a month. That's enough data to be useful but
> not so much that it overwhelms the logging or the people who will
> analyze it. It also avoids the rapid increase in the tail of the
> distribution, which starts to rise exponentially below 2 seconds.
> Three seconds is also a sensible response time for people to expect
> the system to have.

See attached graphs! I don't know why there are two peaks that you can
easily see on the linear scale distribution.

Below is histogram data for a 0.1 second bin of the tail of the
distribution. I can share the raw data or the 0.001 ms binned version
if anyone is interested.

   53   1.0
   [remaining histogram rows truncated]

../Randy

--
# Randy MacLeod
# Wind River Linux
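In case anyone wants to reproduce the binning, something like the awk
below would do it. "dd-times.txt" (one dd time in seconds per line) is
a made-up file name for illustration, not the actual data set.

# Hypothetical recipe for the 0.1 s binned histogram (count, then bin):
awk '{ bin = int($1 * 10) / 10; count[bin]++ }
     END { for (b in count) printf "%8d  %4.1f\n", count[b], b }' dd-times.txt |
    sort -k2 -n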
On 2023-02-02 17:10, Randy MacLeod via lists.yoctoproject.org wrote:
> I don't know why there are two peaks that you can easily see on the
> linear scale distribution

I'm guessing, but I suspect that the lower latency distribution is just
from when the Yocto AB workers are idle. The builders in the cluster at
WR are always ...

--
# Randy MacLeod
# Wind River Linux
On 2023-02-02 18:32, Randy MacLeod via lists.yoctoproject.org wrote:

Sigh, that's obviously wrong since this data is only collected when
running bitbake! Maybe it's the arm vs the intel workers? I'll stop
speculating until I look at the data a bit more...

--
# Randy MacLeod
# Wind River Linux
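One quick way to test the arm vs intel idea would be a per-worker
summary, assuming the raw samples can be dumped as "worker,time" lines.
The file name and format below are assumptions for illustration only.

# Hypothetical per-worker summary to see whether the two peaks split by
# builder type; dd-times-by-worker.csv ("worker,dd_time_seconds") is an
# assumed format, not the real data.
awk -F, '{ sum[$1] += $2; n[$1]++; if ($2 > max[$1]) max[$1] = $2 }
         END {
             for (w in n)
                 printf "%-20s n=%6d mean=%6.3f max=%6.3f\n", w, n[w], sum[w] / n[w], max[w]
         }' dd-times-by-worker.csv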