linux-fslc 5.4.114: rcu_preempt detected stalls on IMX7D #dunfell #meta-freescale


benson_young@...
 

Hello,

We are experiencing a kernel crash while stress testing WiFi functionality.
Please see the attachment for the kernel dump (near the end of the log).
We have read that rcu_preempt stall dumps are sometimes caused by spinlocks used by a driver when the kernel has PREEMPT enabled,
and that the suggested fix is a change along these lines:
spin_unlock_irq(&signal->obj.wait.lock);   --->  raw_spin_unlock_irq(&signal->obj.wait.lock);
spin_lock_irq(&signal->obj.wait.lock);     --->  raw_spin_lock_irq(&signal->obj.wait.lock);
However, the wireless driver on our platform doesn't use such calls.

The WiFi device is an RTL8822CS on the SDIO interface, driven by an out-of-tree driver released by Realtek.
When we use the same wireless driver on the official NXP BSPs, Linux 5.10.9 (gatesgarth) and Linux 4.9.11 (morty), this issue isn't observed.
I am attaching the kernel configs for the two BSPs: config_5.4.114_dunfell (meta-freescale community BSP) and config_5.10.9_gatesgarth (NXP official BSP).

Can anyone please advise on a possible cause and solution?

We much appreciate the assistance. Best regards,

Benson


Otavio Salvador
 

Hello,

On Tue, Jan 11, 2022 at 07:08, <benson_young@...> wrote:
> We are experiencing kernel crash when we are stress testing WiFi functionality over SDIO interface.
> please see attachment for kernel dump (near end of log)
> we have read that rcu_preempt dump are sometimes caused by spin lock used by driver when kernel has PREEMPT enabled.
> and suggested solution is making change as something below
> spin_unlock_irq(&signal->obj.wait.lock);   --->  raw_spin_unlock_irq(&signal->obj.wait.lock);
> spin_lock_irq(&signal->obj.wait.lock);   -----> raw_spin_lock_irq(&signal->obj.wait.lock);
> however, the wireless driver on our platform doesn't use such calls.
>
> When we used the same wireless driver on official NXP BSP (gatesgarth), this issue isn't observed.
> Attaching the kernel configs for the two BSPs.
>
> Can anyone please advise on possible cause and solution?
>
> much appreciate the assistance and best regards,

You'll need to compare the branches and see if it has been fixed in a newer kernel release. 

--
Otavio Salvador                             O.S. Systems
http://www.ossystems.com.br        http://code.ossystems.com.br
Mobile: +55 (53) 9 9981-7854          Mobile: +1 (347) 903-9750


Fabio Estevam
 

Hi Benson,

On Tue, Jan 11, 2022 at 7:08 AM <benson_young@...> wrote:

> Hello,
>
> We are experiencing kernel crash when we are stress testing WiFi functionality over SDIO interface.
> please see attachment for kernel dump (near end of log)

Looking at your log, I see you use an rtl8822 Wifi device, which is
not supported in the mainline kernel.

I assume you use an out-of-tree driver, right?

When using out-of-tree drivers, it is common to see the driver work on
a specific kernel version and then it breaks on a kernel update.

I suggest you consider using a well-supported Wifi chip instead.

Regards,

Fabio Estevam


Fabio Estevam
 

On Tue, Jan 11, 2022 at 10:06 AM Fabio Estevam via lists.yoctoproject.org <festevam=gmail.com@...> wrote:

> Hi Benson,
>
> On Tue, Jan 11, 2022 at 7:08 AM <benson_young@...> wrote:
>
> > Hello,
> >
> > We are experiencing kernel crash when we are stress testing WiFi functionality over SDIO interface.
> > please see attachment for kernel dump (near end of log)
>
> Looking at your log, I see you use an rtl8822 Wifi device, which is
> not supported in the mainline kernel.

Ah, the driver is available at drivers/net/wireless/realtek/rtw88/rtw8822c.c

I would suggest you try a more recent kernel, such as 5.15.y then.

If you still observe issues, then please report them to the maintainer
and lists given by:
./scripts/get_maintainer.pl -f drivers/net/wireless/realtek/rtw88/rtw8822c.c

Hope this helps.


Fabio Estevam
 

On Tue, Jan 11, 2022 at 11:29 PM <benson_young@...> wrote:

> [Edited Message Follows]
>
> Hello,
>
> We are experiencing kernel crash when we are stress testing WiFi functionality.
> please see attachment for kernel dump (near end of log)
> we have read that rcu_preempt dump are sometimes caused by spin lock used by driver when kernel has PREEMPT enabled.
> and suggested solution is making change as something below
> spin_unlock_irq(&signal->obj.wait.lock); ---> raw_spin_unlock_irq(&signal->obj.wait.lock);
> spin_lock_irq(&signal->obj.wait.lock); -----> raw_spin_lock_irq(&signal->obj.wait.lock);
> however, the wireless driver on our platform doesn't use such calls.
>
> The WiFi device is an RTL8822CS using SDIO interface, it is an out-of-tree driver released by Realtek.

Ok, understood. The RTL8822 driver available in the kernel is PCI-only.

You should push Realtek to upstream RTL8822CS SDIO support as working
with an out-of-tree driver is too painful.

> When we applied the same wireless driver on official NXP BSPs Linux 5.10.9 (gatesgarth) and Linux 4.9.11 (morty) this issue isn't observed.
> Attaching the kernel configs for the two BSPs config_5.4.114_dunfell (meta-freescale community BSP). config_5.10.9_gatesgarth (NXP
> official BSP)
>
> Can anyone please advise on possible cause and solution?

I suggest you use a 5.10 linux-fslc kernel version.
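
Since the thread mentions two attached kernel configs (config_5.4.114_dunfell and config_5.10.9_gatesgarth), one low-cost check before or after switching kernels is to compare them option by option. The following is only a minimal sketch of such a comparison in Python. The function names (parse_config, diff_configs, print_diff) and the choice of prefixes to filter on are illustrative assumptions, not something taken from the thread, and nothing here implies that the two configs actually differ in these options. The kernel tree also ships scripts/diffconfig for the same purpose.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old_text, new_text, prefixes=()):
    """Return sorted (option, old, new) tuples where the values differ.

    An option absent from one file is reported with value None.
    `prefixes` optionally restricts the result to option names that
    start with one of the given strings.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        if prefixes and not name.startswith(tuple(prefixes)):
            continue
        if old.get(name) != new.get(name):
            changes.append((name, old.get(name), new.get(name)))
    return changes


def print_diff(old_path, new_path, prefixes=()):
    """Read two config files from disk and print the differing options."""
    with open(old_path) as a, open(new_path) as b:
        for name, old, new in diff_configs(a.read(), b.read(), prefixes):
            print(f"{name}: {old} -> {new}")
```

For example, print_diff("config_5.4.114_dunfell", "config_5.10.9_gatesgarth", prefixes=("CONFIG_PREEMPT", "CONFIG_RCU", "CONFIG_MMC")) would list only the preemption, RCU and MMC/SDIO related options that differ, which are plausible (but unconfirmed) places to start looking for this kind of stall.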