Re: [PATCH yocto-autobuilder-helper 1/4] config.json: add "collect-data" template -- Build: 20210420-1
Summary:
Build: 20210420-1 had 23 triggers and isn't really worth analyzing. Stay tuned for what is hopefully better data with a higher timeout threshold. There is a nice looking pair of graphs attached though! :)

../Randy

On 2021-04-15 4:48 p.m., Randy MacLeod wrote:
On 2021-04-15 1:55 p.m., Randy MacLeod wrote:

Should we increase the interval to 30, 60, or more seconds?

I've bumped the interval to 60 seconds.

I spent some time looking at the first bit of data along with Sakib and Saul from time to time.

General conclusions:

Still true.

2. xz might be a problem but we're not sure yet.

Added to graph attached.

3. We need more data and tools and time to think about it.

To Do:

Still to do, maybe. It's not clear that it's useful yet.

2. sometimes we see:

With a 60 second interval, this does NOT happen.

3. tail the cooker console in addition to top. Present that before top.

Still to do. This can be done via the cooker log. We can parse the file and do:
1. list recent tasks regardless of whether they have completed: tail -20?
2. list tasks that have started but not completed.

This email is about:
https://autobuilder.yocto.io/pub/non-release/20210420-1/

It was a 'quick' build. There was only 1 log file produced. It used master, so the timeout was still 5 seconds.

There were 23 (!!) times that the dd time exceeded the 15 (ACTUALLY 5) second limit out of a total of 1504 invocations, and those triggers were captured by 1 log file:
testresults/qemux86-world/2021-04-20--03-43/host_stats_0_top.txt

Splitting each top output into a separate file as before, now we have 23 log files:

How big are these files, i.e. how many processes/kernel threads were running when top ran?
$ wc -l host-stats-0* | sort -n
    699 host-stats-03:42:40--23.log
    704 host-stats-02:20:41--5.log
    720 host-stats-03:33:41--21.log
    733 host-stats-02:19:40--4.log
    743 host-stats-03:32:41--20.log
    752 host-stats-02:25:45--6.log
    753 host-stats-02:02:46--3.log
    760 host-stats-03:36:07--22.log
    784 host-stats-02:01:48--2.log
    784 host-stats-03:30:43--18.log
    802 host-stats-02:45:55--10.log
    807 host-stats-03:31:42--19.log
    816 host-stats-03:29:49--17.log
    829 host-stats-03:19:50--15.log
    845 host-stats-02:41:40--8.log
    851 host-stats-02:00:57--1.log
    899 host-stats-02:32:47--7.log
    906 host-stats-03:18:41--14.log
    925 host-stats-03:28:24--16.log
    947 host-stats-02:42:41--9.log
   1084 host-stats-03:16:50--13.log
   1204 host-stats-02:54:43--11.log
   1314 host-stats-03:03:41--12.log
  19661 total

I noticed that several but not all log files were running xz:

$ for i in `ls host-stats-*`; do echo -n $i ": "; grep "xz " $i | wc -l; done
host-stats-02:00:57--1.log : 0
host-stats-02:01:48--2.log : 0
host-stats-02:02:46--3.log : 0
host-stats-02:19:40--4.log : 2
host-stats-02:20:41--5.log : 0
host-stats-02:25:45--6.log : 2
host-stats-02:32:47--7.log : 0
host-stats-02:41:40--8.log : 1
host-stats-02:42:41--9.log : 2
host-stats-02:45:55--10.log : 1
host-stats-02:54:43--11.log : 0
host-stats-03:03:41--12.log : 0
host-stats-03:16:50--13.log : 47
host-stats-03:18:41--14.log : 0
host-stats-03:19:50--15.log : 9
host-stats-03:28:24--16.log : 15
host-stats-03:29:49--17.log : 0
host-stats-03:30:43--18.log : 0
host-stats-03:31:42--19.log : 7
host-stats-03:32:41--20.log : 7
host-stats-03:33:41--21.log : 11
host-stats-03:36:07--22.log : 15
host-stats-03:42:40--23.log : 9

I've plotted the number of xz processes along with the load average and the dd time in the attached graph.
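For anyone who wants to redo that kind of plot, here is a rough sketch of how the per-snapshot numbers could be collected into one CSV. It is not the script used for the attached graph. It assumes each host-stats file starts with the usual top header line ("top - HH:MM:SS up ..., load average: a, b, c"); the function name summarize_top_logs and the column layout are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of yocto-autobuilder-helper):
# print one CSV row "time,load1,xz_count" per top snapshot file.
summarize_top_logs() {
    echo "time,load1,xz_count"
    for f in "$@"; do
        # Assumed header format:
        #   top - 02:19:40 up 3 days, ..., load average: 58.12, 60.01, 55.20
        header=$(head -n 1 "$f")
        # Timestamp that top printed for this snapshot.
        t=$(printf '%s\n' "$header" | sed -n 's/^top - \([0-9:]*\).*/\1/p')
        # 1-minute load average (first of the three values).
        load=$(printf '%s\n' "$header" | sed -n 's/.*load average: \([0-9.]*\),.*/\1/p')
        # Same match as the grep loop above: lines containing "xz ".
        # grep -c exits non-zero on zero matches, hence the "|| true".
        xz=$(grep -c "xz " "$f" || true)
        echo "$t,$load,$xz"
    done
}
```

Something like "summarize_top_logs host-stats-0*.log > summary.csv" would then give a file that gnuplot or a spreadsheet can read. The dd time is not in the top output, so it would still have to be merged in from wherever it was recorded.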
All of the top output logs seem to be running oe-selftest:

$ for i in host-stats-0*; do grep -H -c "DISPLAY.*oe-selftest " $i ; done
host-stats-02:00:57--1.log:1
host-stats-02:01:48--2.log:1
host-stats-02:02:46--3.log:1
host-stats-02:19:40--4.log:1
host-stats-02:20:41--5.log:1
host-stats-02:25:45--6.log:1
host-stats-02:32:47--7.log:1
host-stats-02:41:40--8.log:2
host-stats-02:42:41--9.log:2
host-stats-02:45:55--10.log:2
host-stats-02:54:43--11.log:2
host-stats-03:03:41--12.log:2
host-stats-03:16:50--13.log:2
host-stats-03:18:41--14.log:2
host-stats-03:19:50--15.log:2
host-stats-03:28:24--16.log:2
host-stats-03:29:49--17.log:2
host-stats-03:30:43--18.log:2
host-stats-03:31:42--19.log:2
host-stats-03:32:41--20.log:2
host-stats-03:33:41--21.log:2
host-stats-03:36:07--22.log:2
host-stats-03:42:40--23.log:2

The logs DO seem to be clustered, in that there are several adjacent 1 minute intervals where the dd time exceeded the threshold. The data is below, but you can also see this in the attached graph.

$ for i in host-stats-*; do echo -n $i ": "; head -1 $i | cut -c -15; done
host-stats-02:00:57--1.log : top - 02:00:57
host-stats-02:01:48--2.log : top - 02:01:48
host-stats-02:02:46--3.log : top - 02:02:46
host-stats-02:19:40--4.log : top - 02:19:40
host-stats-02:20:41--5.log : top - 02:20:41
host-stats-02:25:45--6.log : top - 02:25:45
host-stats-02:32:47--7.log : top - 02:32:47
host-stats-02:41:40--8.log : top - 02:41:40
host-stats-02:42:41--9.log : top - 02:42:41
host-stats-02:45:55--10.log : top - 02:45:55
host-stats-02:54:43--11.log : top - 02:54:43
host-stats-03:03:41--12.log : top - 03:03:41
host-stats-03:16:50--13.log : top - 03:16:50
host-stats-03:18:41--14.log : top - 03:18:41
host-stats-03:19:50--15.log : top - 03:19:50
host-stats-03:28:24--16.log : top - 03:28:24
host-stats-03:29:49--17.log : top - 03:29:49
host-stats-03:30:43--18.log : top - 03:30:43
host-stats-03:31:42--19.log : top - 03:31:42
host-stats-03:32:41--20.log : top - 03:32:41
host-stats-03:33:41--21.log : top - 03:33:41
host-stats-03:36:07--22.log : top - 03:36:07
host-stats-03:42:40--23.log : top - 03:42:40

*** Add time distribution of dd.

Oddly enough, the output of dd does not seem to be consistent:

102400 bytes (102 kB) copied, 0.0163975 s, 6.2 MB/s
102400 bytes (102 kB) copied, 0.0054732 s, 18.7 MB/s
102400 bytes (102 kB, 100 KiB) copied, 0.0984687 s, 1.0 MB/s
102400 bytes (102 kB, 100 KiB) copied, 0.304233 s, 337 kB/s

I'm not sure what's going on there, yet. Sakib?

I think the cooker output will be quite useful.

I'm not going to put any more time into this data set since the time threshold is still 5 seconds.

All for now.

../Randy

--
# Randy MacLeod
# Wind River Linux