Hi Bruce,
On 3/13/23 08:46, Bruce Ashfield wrote:
CAUTION: This email comes from a non Wind River email account!
Do not click links or open attachments unless you recognize the sender and know the content is safe.
On Wed, Mar 8, 2023 at 10:49 PM Xiangyu Chen
<xiangyu.chen@...> wrote:
Hi Bruce,
Sorry for the late reply.
On 3/8/23 12:00, Bruce Ashfield wrote:
On Thu, Mar 2, 2023 at 8:17 PM Xiangyu Chen
<xiangyu.chen@...> wrote:
Hi Bruce,
On 3/3/23 05:24, Bruce Ashfield wrote:
In message: [meta-virtualization][PATCH 0/1] lxc: templates/lxc-busybox.in: if busybox contains init then use it
on 01/03/2023 Xiangyu Chen wrote:
From: Xiangyu Chen <xiangyu.chen@...>
Hi Bruce,
Recently we found that the lxc ptest has many failing cases, as shown in log-1 below. After checking the
code, some cases fail for reasons related to the init process. For example, lxc-test-exit-code needs to
start the container as a daemon, but with bash as init the container cannot start correctly.
Is there an indication of what busybox is providing that bash isn't ?
In my local setup, with bash as the container init, the container
doesn't support "reboot" and cannot start correctly in daemon mode.
The test steps are:
lxc-create -t busybox -n t
lxc-start -n t -d
lxc-ls -f
The status of container "t" is still "STOPPED", but when we use busybox
init instead of bash, the container status is correct in daemon mode.
I'm setting up to test this myself, but generally speaking we should
include this detail in the commit log.
Thanks. Once we find the root cause, I'll add the details from this
discussion to the commit log and send a v2 patch :p
I don't like to force this in the ptest, while for actual lxc containers
we still allow bash, which means it may not be functional.
This is a common template for lxc, and I am not sure whether others
still need bash as the lxc container init, so I based my change on the patch
"template-make-busybox-template-compatible-with-core-.patch" and added
busybox back.
That's the part that concerns me. Why does our bash behave differently
than in other lxc integrations and other distros ?
Currently, when using lxc with the busybox template in daemon mode,
the status stays "STOPPED", but it works well in foreground mode.
Do you have the ability to run the same simple tests you have above on
a desktop distro ?
I set up a VirtualBox VM today and ran some tests with tracing; here is
what I found:
As mentioned above, lxc works well in foreground mode but something
goes wrong in daemon mode. According to the lxc-start manual, foreground
mode attaches the tty to /dev/console, but daemon mode doesn't.
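One way to see this difference from inside a process is to ask whether stdin is a terminal. The snippet below is only a sketch: `stdin_kind` is a hypothetical helper, and the redirect from /dev/null is an assumption standing in for what a daemonized container init might receive.

```shell
# Hypothetical helper: report whether stdin is attached to a terminal.
stdin_kind() {
    if [ -t 0 ]; then
        echo tty
    else
        echo not-a-tty
    fi
}

# Assumption: daemon mode leaves init without a terminal on stdin,
# which is imitated here by redirecting stdin from /dev/null.
daemon_like=$(stdin_kind </dev/null)
echo "stdin from /dev/null: $daemon_like"
```

Run from an interactive terminal without the redirect, the same helper would be expected to report a tty instead.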
And to confirm, this is running on something like ubuntu, using the
ubuntu bash ?
It was a VirtualBox VM with openSUSE Leap 15.5; the lxc version is 4.0.12, the bash version is 4.4.23, and busybox was built by myself from the master branch.
When using busybox as init, the container runs normally as a daemon:
localhost:~ # lxc-create --version
4.0.12
localhost:~ # lxc-create -t busybox -n t-bb
localhost:~ # lxc-start -n t-bb -d
localhost:~ # lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
t-bb RUNNING 0 - - - false
localhost:~ # lxc-stop -n t-bb
localhost:~ # lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
t-bb STOPPED 0 - - - false
localhost:~ #
localhost:~ #
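For scripting the same check, the STATE column of `lxc-ls -f` can be picked out by container name. This is a rough sketch: `container_state` is a hypothetical helper, and the sample text is copied from the output above rather than taken from a live run.

```shell
# Hypothetical helper: print the STATE column for one container.
# $1: container name; stdin: the table printed by `lxc-ls -f`.
container_state() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Sample copied from the session above, not from a live lxc-ls call.
sample='NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
t-bb RUNNING 0 - - - false'

state=$(printf '%s\n' "$sample" | container_state t-bb)
echo "t-bb is $state"
```

On a host with lxc installed, the sample could be replaced by `lxc-ls -f | container_state t-bb`.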
The strace log:
#####log of "strace -f lxc-start -n t-bb -d" #######
execve("/usr/bin/lxc-start", ["lxc-start", "-n", "t-bb", "-d"], 0x7ffe0ef0e300 /* 57 vars */) = 0
... loading and mapping libraries for lxc ...
[pid 23053] execve("/sbin/init", ["/sbin/init"], 0x563be8d3fae0 /* 2 vars */ <unfinished ...>
... loading and mapping libraries for /sbin/init in container...
[pid 23053] reboot(LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2, LINUX_REBOOT_CMD_CAD_OFF) = -1 EINVAL (Invalid argument)
[pid 23053] openat(AT_FDCWD, "/dev/null", O_RDWR) = 3
[pid 23053] close(3) = 0
[pid 23053] ioctl(0, VT_OPENQRY, 0x7ffd8e2e5b28) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 23053] brk(NULL) = 0x19ea000
[pid 23053] brk(0x1a0b000) = 0x1a0b000
[pid 23053] ioctl(0, TCGETS, 0x7ffd8e2e5a90) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 23053] chdir("/") = 0
[pid 23053] setsid() = -1 EPERM (Operation not permitted)
[pid 23053] openat(AT_FDCWD, "/etc/inittab", O_RDONLY) = 3
[pid 23053] fstat(3, {st_mode=S_IFREG|0644, st_size=97, ...}) = 0
[pid 23053] read(3, "::sysinit:/etc/init.d/rcS\ntty1::"..., 4096) = 97
[pid 23053] read(3, "", 4096) = 0
[pid 23053] close(3) = 0
.... add signal process callback ....
[pid 23054] execve("/etc/init.d/rcS", ["/etc/init.d/rcS"], 0x19ea2a0 /* 6 vars */ <unfinished ...>
.... loading and mapping libraries for running /etc/init.d/rcS .....
[pid 23055] execve("/bin/syslogd", ["/bin/syslogd"], 0x1308670 /* 8 vars */) = 0
.....
[pid 23056] execve("/bin/mount", ["/bin/mount", "-a"], 0x13086b8 /* 8 vars */ <unfinished ...>
...
[pid 23057] execve("/bin/udhcpc", ["/bin/udhcpc"], 0x1308670 /* 8 vars */) = 0
...
[pid 23058] execve("/bin/getty", ["/bin/getty", "-L", "tty1", "115200", "vt100"], 0x19ea2a0 /* 6 vars */ <unfinished ...>
...
[pid 23061] execve("/bin/sh", ["/bin/sh"], 0x13d02a0 /* 6 vars */ <unfinished ...>
[pid 23061] ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
...
[pid 23061] ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid 23061] openat(AT_FDCWD, "/dev/tty", O_RDWR) = 3
[pid 23061] fcntl(3, F_DUPFD_CLOEXEC, 10) = 10
[pid 23061] close(3 <unfinished ...>
[pid 23061] <... close resumed>) = 0
[pid 23061] ioctl(10, TIOCGPGRP <unfinished ...>
[pid 23061] <... ioctl resumed>, [8]) = 0
[pid 23061] getpgrp() = 8
...
[pid 23061] setpgid(0, 8 <unfinished ...>
[pid 23061] <... setpgid resumed>) = -1 EPERM (Operation not permitted)
[pid 23061] ioctl(10, TIOCSPGRP, [8] <unfinished ...>
[pid 23061] <... ioctl resumed>) = 0
[pid 23061] ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid 23061] ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost -isig -icanon -echo ...} <unfinished ...>
[pid 23061] <... ioctl resumed>) = 0
[pid 23061] ioctl(0, TIOCGWINSZ, {ws_row=0, ws_col=0, ws_xpixel=0, ws_ypixel=0}) = 0
[pid 23061] geteuid() = 0
[pid 23061] openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
[pid 23061] fstat(3, <unfinished ...>
[pid 23061] <... fstat resumed>{st_mode=S_IFREG|0644, st_size=30, ...}) = 0
[pid 23061] read(3, <unfinished ...>
[pid 23061] <... read resumed>"root:x:0:0:root:/root:/bin/sh\n", 4096) = 30
[pid 23061] close(3 <unfinished ...>
[pid 23061] <... close resumed>) = 0
[pid 23061] geteuid( <unfinished ...>
[pid 23061] <... geteuid resumed>) = 0
[pid 23061] fstat(1, <unfinished ...>
[pid 23061] <... fstat resumed>{st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x2), ...}) = 0
[pid 23061] rt_sigaction(SIGWINCH, {sa_handler=0x4b9ce8, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fc7eecdcd50}, <unfinished ...>
[pid 23061] <... rt_sigaction resumed>{sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
[pid 23061] write(1, "/ # ", 4 <unfinished ...>
[pid 23061] <... write resumed>) = 4
[pid 23061] poll([{fd=0, events=POLLIN}], 1, -1 <unfinished ...>
############end of "strace -f lxc-start -n t-bb -d" ###############
When using bash as init, the container stays STOPPED in daemon mode but can run in foreground mode:
localhost:~ # lxc-create -t busybox -n t-bash
localhost:~ # lxc-start -n t-bash -d
localhost:~ # lxc-ls -f
NAME STATE AUTOSTART GROUPS IPV4 IPV6 UNPRIVILEGED
t-bash STOPPED 0 - - - false
t-bb STOPPED 0 - - - false
localhost:~ # lxc-start -n t-bash -F
init-4.4#
init-4.4#
init-4.4# /sbin/init --version
GNU bash, version 4.4.23(1)-release (x86_64-suse-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
init-4.4# exit
exit
localhost:~ #
The strace log:
#####log of "strace -f lxc-start -n bash -d" #######
[pid 22977] execve("/sbin/init", ["/sbin/init"], 0x562a9b3d80f0 /* 2 vars */ <unfinished ...>
.... loading and mapping libraries .......
[pid 22977] openat(AT_FDCWD, "/dev/tty", O_RDWR|O_NONBLOCK) = -1 ENXIO (No such device or address)
[pid 22977] ioctl(0, TCGETS, 0x7ffc1d039fa0) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 22977] stat("/usr/lib/locale/locale-archive", 0x7ffc1d03a110) = -1 ENOENT (No such file or directory)
[pid 22977] brk(NULL) = 0x562bad4cf000
[pid 22977] brk(0x562bad4f0000) = 0x562bad4f0000
[pid 22977] getuid() = 0
[pid 22977] getgid() = 0
[pid 22977] geteuid() = 0
[pid 22977] getegid() = 0
[pid 22977] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid 22977] ioctl(0, TCGETS, 0x7ffc1d03a120) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 22977] ioctl(-1, TIOCGPGRP, 0x7ffc1d03a194) = -1 EBADF (Bad file descriptor)
[pid 22977] sysinfo({uptime=5034, loads=[13568, 5344, 992], totalram=4115214336, freeram=1067577344, sharedram=30674944, bufferram=1994752, totalswap=2148507648, freeswap=2148507648, procs=381, totalhigh=0, freehigh=0, mem_unit=1}) = 0
.... add signal process callback and start process network configurations in /etc ....
[pid 22977] openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
[pid 22977] lseek(3, 0, SEEK_CUR) = 0
[pid 22977] fstat(3, {st_mode=S_IFREG|0644, st_size=30, ...}) = 0
[pid 22977] read(3, "root:x:0:0:root:/root:/bin/sh\n", 4096) = 30
[pid 22977] close(3) = 0
[pid 22977] getppid() = 0
[pid 22977] getpid() = 1
[pid 22977] getpgrp() = 1
[pid 22977] ioctl(2, TIOCGPGRP, 0x7ffc1d03a064) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 22977] rt_sigaction(SIGCHLD, {sa_handler=0x562bacebdb50, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fdb74971d50}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fdb74971d50}, 8) = 0
[pid 22977] ioctl(2, TIOCGPGRP, 0x7ffc1d03a044) = -1 ENOTTY (Inappropriate ioctl for device)
[pid 22977] prlimit64(0, RLIMIT_NPROC, NULL, {rlim_cur=15574, rlim_max=15574}) = 0
[pid 22977] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid 22977] fcntl(0, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 22977] fstat(0, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}) = 0
[pid 22977] lseek(0, 0, SEEK_CUR) = 0
[pid 22977] read(0, "", 1) = 0
[pid 22977] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 22977] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 22977] exit_group(0) = ?
[pid 22977] +++ exited with 0 +++
#####end of "strace -f lxc-start -n bash -d" #######
Br,
Xiangyu
I enabled the lxc debug trace as below when starting a container in
daemon mode (only the init-related part of the log, with timestamps and
full source code paths removed):
##### lxc-start -n t -o /tmp/log.txt -l TRACE -d #####
start - /src/lxc/start.c:post_start:2205 - Started "/sbin/init" with pid
"871"
start - /src/lxc/start.c:lxc_serve_state_clients:483 - Set container
state to RUNNING
<<<<<<<<< we can see bash starting as init, and lxc updates the state
start - /src/lxc/start.c:lxc_serve_state_clients:486 - No state clients
registered
mainloop - /src/lxc/mainloop.c:__epoll_open:493 - Created epoll instance
mainloop - /mainloop.c:__epoll_open:493 - Created epoll instance
start - /src/lxc/start.c:lxc_poll:626 - Mainloop is ready
start - /src/lxc/start.c:signal_handler:396 - Received signal
ssi_signo(17) for ssi_pid(871), si_signo(17), si_pid(871)
start - /src/lxc/start.c:signal_handler:464 - Container init process 871
exited
<<<<<<<<<< something seems wrong with init: it exited and lxc received
the exit signal.
start - /src/lxc/start.c:lxc_poll:643 - Closed console mainloop
start - /src/lxc/start.c:lxc_poll:648 - Closed mainloop
start - /src/lxc/start.c:lxc_poll:651 - Closed signal file descriptor 7
..... removed some networking termination related trace .....
start - /src/lxc/start.c:lxc_serve_state_clients:483 - Set container
state to STOPPING
<<<<<<<<<<< now lxc sets the container state back to STOPPING.
start - /src/lxc/start.c:lxc_serve_state_clients:486 - No state clients
registered
##### end of lxc-start -n t -o /tmp/log.txt -l TRACE -d #####
Let's use strace to see what happens in the container (init-related part of the log):
#####strace -s 1024 -f lxc-start -n t -d #####
[pid 1211] execve("/sbin/init", ["/sbin/init"], 0x55a07c90eb30 /* 1 var
*/ <unfinished ...>
......
[pid 1211] ioctl(2, TIOCGPGRP, 0x7fffe212610c) = -1 ENOTTY
(Inappropriate ioctl for device)
[pid 1211] rt_sigaction(SIGCHLD, {sa_handler=0x5632e07dcec0,
sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART,
sa_restorer=0x7f1237db3190}, {sa_handler=SIG_DFL, sa_mask=[],
sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f1237db3190}, 8) = 0
[pid 1211] ioctl(2, TIOCGPGRP, 0x7fffe21260ec) = -1 ENOTTY
(Inappropriate ioctl for device)
[pid 1211] prlimit64(0, RLIMIT_NPROC, NULL, {rlim_cur=3818,
rlim_max=3818}) = 0
[pid 1211] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid 1211] fcntl(0, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 1211] newfstatat(0, "", {st_mode=S_IFCHR|0666,
st_rdev=makedev(0x1, 0x3), ...}, AT_EMPTY_PATH) = 0
[pid 1211] lseek(0, 0, SEEK_CUR) = 0
[pid 1211] read(0, "", 1) = 0
[pid 1211] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 1211] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 1211] exit_group(0) = ?
[pid 1211] +++ exited with 0 +++
#####end of strace -s 1024 -f lxc-start -n t -d #####
It looks like the issue is caused by bash having no usable
stdin/stdout/stderr to attach to, right?
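The tail of the trace (fstat on fd 0 reporting makedev(0x1, 0x3), which is the device number of /dev/null on Linux, then read(0, "", 1) = 0 and exit_group(0)) can be imitated outside a container. A minimal sketch, assuming plain `sh` is an acceptable stand-in for the bash init:

```shell
# A shell that reads its commands from stdin sees end-of-file at once
# when stdin is /dev/null, so it has nothing to run and exits.
sh </dev/null
status=$?

# Exit status 0 corresponds to the exit_group(0) seen in the trace.
echo "shell exited with status $status"
```

This only shows that a shell given /dev/null as stdin exits cleanly on its own; it does not show how lxc-start sets up the container's file descriptors.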
It does look like that. When busybox is used as init, do you have a
similar strace ? I'd like to do that comparison.
Bruce
I don't want to force this switch to busybox, without understanding if
we are the only ones seeing this issue .. since that means we are
simply hiding an issue, versus fixing it.
Yes indeed. If someone is using systemd as init, they need to add extra
configuration in local.conf to set up busybox while making sure the
default systemd init is not replaced by busybox.
Thanks,
Xiangyu
Bruce
There are other init options in meta-virt, like the docker tini, if we
had to enforce something, I'd rather that than busybox.
Good to hear about tini :)
Indeed, enabling busybox init might need a lot of additional effort to
take care of systems that use systemd; otherwise /sbin/init would always
be overridden by busybox.
Bruce
Br,
Xiangyu
So I added a check for the busybox init applet to the lxc-busybox template: if the system's busybox contains
init, then use it. After applying this patch, the ptest result is shown in log-2.
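The idea of the check can be sketched roughly as follows. This is not the code from the patch: `has_applet` and `init_choice` are hypothetical names, and the fallback branch simply stands for whatever init the template installs today. The applet list comes from `busybox --list`, which prints one applet name per line.

```shell
# Hypothetical helper: succeed if the applet named in $1 appears on stdin.
# stdin: one applet name per line, as printed by `busybox --list`.
has_applet() {
    grep -qx -- "$1"
}

# Assumption: busybox may be missing on the build host, so test for it first.
if command -v busybox >/dev/null 2>&1 && busybox --list | has_applet init; then
    init_choice=busybox
else
    init_choice=fallback
fi
echo "init choice: $init_choice"
```

Reading the list from stdin keeps the helper testable without busybox, for example with `printf 'ash\ninit\n' | has_applet init`.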
######## log-1: ptest without patch #######
### Starting LXC ptest ###
FAIL: lxc-test-api-reboot
SKIPPED: lxc-test-apparmor
PASS: lxc-test-apparmor-generated
FAIL: lxc-test-apparmor-mount
PASS: lxc-test-arch-parse
FAIL: lxc-test-attach
PASS: lxc-test-automount
FAIL: lxc-test-autostart
PASS: lxc-test-basic
FAIL: lxc-test-capabilities
FAIL: lxc-test-cgpath
PASS: lxc-test-checkpoint-restore
FAIL: lxc-test-cloneconfig
FAIL: lxc-test-clonetest
FAIL: lxc-test-concurrent
PASS: lxc-test-config-jump-table
FAIL: lxc-test-console
FAIL: lxc-test-console-log
FAIL: lxc-test-containertests
FAIL: lxc-test-createconfig
FAIL: lxc-test-createtest
PASS: lxc-test-criu-check-feature
FAIL: lxc-test-cve-2019-5736
FAIL: lxc-test-destroytest
FAIL: lxc-test-device-add-remove
FAIL: lxc-test-exit-code
FAIL: lxc-test-get_item
PASS: lxc-test-getkeys
PASS: lxc-test-list
PASS: lxc-test-locktests
FAIL: lxc-test-lxc-attach
PASS: lxc-test-lxcpath
PASS: lxc-test-may-control
FAIL: lxc-test-mount-injection
FAIL: lxc-test-no-new-privs
PASS: lxc-test-parse-config-file
FAIL: lxc-test-proc-pid
FAIL: lxc-test-procsys
PASS: lxc-test-raw-clone
PASS: lxc-test-reboot
FAIL: lxc-test-rootfs
FAIL: lxc-test-rootfs-options
FAIL: lxc-test-saveconfig
FAIL: lxc-test-share-ns
FAIL: lxc-test-shortlived
SKIPPED: lxc-test-shutdowntest
FAIL: lxc-test-snapdeps
FAIL: lxc-test-snapshot
FAIL: lxc-test-startone
SKIPPED: lxc-test-state-server
FAIL: lxc-test-symlink
FAIL: lxc-test-sys-mixed
FAIL: lxc-test-sysctls
FAIL: lxc-test-unpriv
FAIL: lxc-test-usernic
PASS: lxc-test-usernsexec
PASS: lxc-test-utils
Results:
PASSED = 17
FAILED = 37
SKIPPED = 3
(for details check individual test log in ./logs directory)
###########log-2: ptest with patch ###################
root@intel-x86-64:/usr/lib64/lxc/ptest# ./run-ptest
### Starting LXC ptest ###
PASS: lxc-test-api-reboot
SKIPPED: lxc-test-apparmor
PASS: lxc-test-apparmor-generated
FAIL: lxc-test-apparmor-mount
PASS: lxc-test-arch-parse
PASS: lxc-test-attach
PASS: lxc-test-automount
PASS: lxc-test-autostart
PASS: lxc-test-basic
PASS: lxc-test-capabilities
PASS: lxc-test-cgpath
PASS: lxc-test-checkpoint-restore
FAIL: lxc-test-cloneconfig
PASS: lxc-test-clonetest
PASS: lxc-test-concurrent
PASS: lxc-test-config-jump-table
PASS: lxc-test-console
PASS: lxc-test-console-log
PASS: lxc-test-containertests
PASS: lxc-test-createconfig
PASS: lxc-test-createtest
PASS: lxc-test-criu-check-feature
PASS: lxc-test-cve-2019-5736
PASS: lxc-test-destroytest
PASS: lxc-test-device-add-remove
PASS: lxc-test-exit-code
FAIL: lxc-test-get_item
PASS: lxc-test-getkeys
PASS: lxc-test-list
PASS: lxc-test-locktests
PASS: lxc-test-lxc-attach
PASS: lxc-test-lxcpath
PASS: lxc-test-may-control
PASS: lxc-test-mount-injection
FAIL: lxc-test-no-new-privs
PASS: lxc-test-parse-config-file
PASS: lxc-test-proc-pid
PASS: lxc-test-procsys
PASS: lxc-test-raw-clone
PASS: lxc-test-reboot
PASS: lxc-test-rootfs
PASS: lxc-test-rootfs-options
PASS: lxc-test-saveconfig
PASS: lxc-test-share-ns
PASS: lxc-test-shortlived
SKIPPED: lxc-test-shutdowntest
FAIL: lxc-test-snapdeps
PASS: lxc-test-snapshot
PASS: lxc-test-startone
SKIPPED: lxc-test-state-server
PASS: lxc-test-symlink
PASS: lxc-test-sys-mixed
PASS: lxc-test-sysctls
FAIL: lxc-test-unpriv
FAIL: lxc-test-usernic
PASS: lxc-test-usernsexec
PASS: lxc-test-utils
Results:
PASSED = 47
FAILED = 7
SKIPPED = 3
(for details check individual test log in ./logs directory)
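To compare two runs without counting by eye, the totals can be recomputed from the result prefixes. A small sketch, assuming the log is saved to a file: `summarize_ptest` is a hypothetical helper, and the four sample lines are taken from the start of log-2.

```shell
# Hypothetical helper: count results by prefix in a saved ptest log.
# $1: path to a file with lines such as "PASS: lxc-test-basic".
summarize_ptest() {
    passed=$(grep -c '^PASS:' "$1")
    failed=$(grep -c '^FAIL:' "$1")
    skipped=$(grep -c '^SKIPPED:' "$1")
    echo "PASSED = $passed FAILED = $failed SKIPPED = $skipped"
}

# Sample: the first four result lines of log-2.
log=$(mktemp)
cat >"$log" <<'EOF'
PASS: lxc-test-api-reboot
SKIPPED: lxc-test-apparmor
PASS: lxc-test-apparmor-generated
FAIL: lxc-test-apparmor-mount
EOF

summary=$(summarize_ptest "$log")
echo "$summary"
rm -f "$log"
```

Running it over the full log-1 and log-2 files should give the same totals that run-ptest prints in its Results section.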
Xiangyu Chen (1):
lxc: templates/lxc-busybox.in: if busybox contains init then use it
...box-contains-init-use-it-in-containe.patch | 45 +++++++++++++++++++
recipes-containers/lxc/lxc_git.bb | 1 +
2 files changed, 46 insertions(+)
create mode 100644 recipes-containers/lxc/files/0001-template-if-busybox-contains-init-use-it-in-containe.patch
--
2.34.1
--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II