Description
When using the UNO Q as a Linux host to control a Klipper-based 3D printer, all USB peripherals connected via a USB hub are disconnected when the Klipper service is restarted.
The xHCI host controller (xhci-hcd.2.auto, DWC3 at 0x04e00000) crashes permanently whenever a USB CDC-ACM device (STM32-based MCU running Klipper firmware) performs a hardware reset via NVIC_SystemReset(). The reset causes an abrupt USB disconnect mid-transaction. The xHCI controller attempts to stop the in-flight endpoint, hangs, and the kernel declares it dead — taking down the entire USB bus.
Devices connected
Bus 001 Device 002: ID 05e3:0610 Genesys Logic, Inc. Hub
Bus 001 Device 004: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 001 Device 005: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 001 Device 006: ID 04d9:8030 Holtek Semiconductor, Inc. BTT-HDMI5
Bus 001 Device 007: ID 090c:037c Silicon Motion, Inc. 300k Pixel Camera
Bus 001 Device 008: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 001 Device 012: ID 1d50:614e OpenMoko, Inc. stm32f446xx (Klipper MCU)
Bus 001 Device 013: ID 1d50:614e OpenMoko, Inc. stm32g431xx (Klipper MCU)
Bus 001 Device 014: ID 1d50:614e OpenMoko, Inc. stm32h723xx (Klipper MCU)
Bus 001 Device 015: ID 1d50:614e OpenMoko, Inc. stm32g0b1xx (Klipper MCU)
dmesg
The crash is consistently triggered by one or more CDC-ACM devices disconnecting simultaneously, and always produces this sequence:
usb 1-1.3.1.1: USB disconnect, device number 10
usb 1-1.2.3: USB disconnect, device number 9
xhci-hcd xhci-hcd.2.auto: xHCI host not responding to stop endpoint command
xhci-hcd xhci-hcd.2.auto: xHCI host controller not responding, assume dead
xhci-hcd xhci-hcd.2.auto: HC died; cleaning up
usb 1-1: USB disconnect, device number 2
cdc_acm 1-1.3.1.1:1.1: acm_start_wb - usb_submit_urb(write bulk) failed: -19
usb 1-1.1: USB disconnect, device number 3
usb 2-1: USB disconnect, device number 2
usb 1-1.2: USB disconnect, device number 4
usb 1-1.2.1: USB disconnect, device number 6
usb 1-1.3: USB disconnect, device number 5
usb 1-1.3.1: USB disconnect, device number 8
usb 1-1.3.1.3: USB disconnect, device number 11
usb 1-1.4: USB disconnect, device number 7
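To check for this signature without reading the log by hand, a small filter along these lines may help. It is only a sketch: `hc_died` is a hypothetical helper name, and it assumes the kernel keeps printing the "HC died" line exactly as quoted above.

```shell
#!/bin/sh
# hc_died CONTROLLER
# Reads kernel log text on stdin and returns 0 if it contains the
# "HC died" message for CONTROLLER (e.g. xhci-hcd.2.auto), 1 otherwise.
# Hypothetical helper; the match string is copied from the dmesg output above.
hc_died() {
    hc="$1"
    # -F: fixed-string match, so the dots in the controller name are literal.
    # A substring match also tolerates dmesg timestamp prefixes.
    grep -F -q "xhci-hcd $hc: HC died"
}

# Example (not run here):
#   dmesg | hc_died xhci-hcd.2.auto && echo "controller is dead"
```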
The entire USB bus is lost. Recovery requires unbinding and rebinding the xHCI platform driver:
echo "xhci-hcd.2.auto" > /sys/bus/platform/drivers/xhci-hcd/unbind
sleep 2
echo "xhci-hcd.2.auto" > /sys/bus/platform/drivers/xhci-hcd/bind
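For repeated testing, the three commands above can be wrapped in a function. This is a minimal sketch, not a fix: `rebind_xhci` is a hypothetical name, and the driver directory and delay are parameters only so the function can be exercised against a scratch directory. It must run as root when pointed at the real sysfs path.

```shell
#!/bin/sh
# rebind_xhci DEVICE [DRIVER_DIR] [DELAY_SECONDS]
# Writes DEVICE to DRIVER_DIR/unbind, waits, then writes it to DRIVER_DIR/bind.
# Defaults mirror the manual recovery commands from this report.
rebind_xhci() {
    dev="$1"
    drv="${2:-/sys/bus/platform/drivers/xhci-hcd}"
    delay="${3:-2}"

    # Detach the dead controller from the driver.
    printf '%s\n' "$dev" > "$drv/unbind" || return 1
    # Give the kernel time to finish tearing down the old bus.
    sleep "$delay"
    # Re-attach; this re-probes the controller and re-enumerates the bus.
    printf '%s\n' "$dev" > "$drv/bind" || return 1
}

# Example (not run here, needs root):
#   rebind_xhci xhci-hcd.2.auto
```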
What has been tried (does not fix the issue)
The following DWC3 device tree quirks were added to the usb@4e00000 node in qrb2210-arduino-imola.dtb — all confirmed active via /proc/device-tree — but the crash persists:
- snps,parkmode-disable-ss-quirk — added per LKML patch for QCM2290
- snps,parkmode-disable-hs-quirk
- snps,dis_u3_susphy_quirk
Also attempted:
- usbcore.autosuspend=-1 kernel parameter
- restart_method: command in Klipper (avoids DTR toggle, but MCU still executes NVIC_SystemReset() which causes the same abrupt disconnect)
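For anyone reproducing this, the "confirmed active via /proc/device-tree" check can be scripted roughly as follows. This is a sketch under assumptions: `missing_quirks` is a hypothetical helper, and the full path of the usb@4e00000 node under /proc/device-tree is not stated in this report, so the example locates it with `find` instead of hard-coding it. It relies on each device tree property appearing as a file inside its node directory.

```shell
#!/bin/sh
# missing_quirks NODE_DIR PROPERTY...
# Prints every PROPERTY that does not exist as a file under NODE_DIR.
# Returns 0 if all properties are present, 1 if any is missing.
missing_quirks() {
    node="$1"
    shift
    rc=0
    for prop in "$@"; do
        if [ ! -e "$node/$prop" ]; then
            printf '%s\n' "$prop"
            rc=1
        fi
    done
    return "$rc"
}

# Example (not run here); the node location is discovered, not assumed:
#   node=$(find /proc/device-tree/ -type d -name 'usb@4e00000' | head -n 1)
#   missing_quirks "$node" \
#       snps,parkmode-disable-ss-quirk \
#       snps,parkmode-disable-hs-quirk \
#       snps,dis_u3_susphy_quirk
```

An empty result with exit status 0 means all three quirks are present in the running device tree.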
Root cause analysis
The crash occurs because NVIC_SystemReset() on the MCU kills the USB peripheral mid-transaction without a graceful USB soft-disconnect. The xHCI driver then issues a "stop endpoint" command for an endpoint that has already been invalidated. On this SoC's DWC3/xHCI implementation, that command never completes: the controller hangs for ~5 seconds and is then declared dead by the kernel.
The trigger is not specific to Klipper. Any USB CDC-ACM device that performs an abrupt hardware reset will reproduce this. The issue is in the xHCI driver's inability to time out and recover gracefully from a "stop endpoint" command when the device has disappeared.
Similar crashes have been observed and fixed on other platforms by adding timeout handling to the xHCI stop endpoint command path, or by improving the DWC3 error recovery path. The existing XHCI_RESET_ON_RESUME quirk and similar mechanisms do not cover the surprise-disconnect + stuck-endpoint scenario.