
fix(backend/tty): add delayed rescan for connectors missing EDID on hotplug#3409

Open
coleleavitt wants to merge 1 commit into niri-wm:main from coleleavitt:fix/monitor-rescan-edid-race


@coleleavitt

Problem

USB-C docks using DP MST / alt-mode (e.g. Lenovo ThinkPad USB-C Dock Gen 2) can cause the kernel DRM subsystem to report a connector as Connected before its EDID data has been read. When this happens:

  1. connector.modes() returns an empty list
  2. pick_mode() returns None
  3. connector_connected() logs "no mode" and skips activation
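
The failure chain above can be sketched as follows. This is a simplified model, not niri's actual code: the `Mode` struct, the `Activation` enum, and the signatures of `pick_mode()` and `connector_connected()` are assumptions made for illustration.

```rust
/// Hypothetical stand-in for a DRM display mode (not niri's or smithay's real type).
#[derive(Debug, Clone, PartialEq)]
struct Mode {
    width: u16,
    height: u16,
    refresh_mhz: u32,
    preferred: bool,
}

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum Activation {
    Activated(Mode),
    SkippedNoMode,
}

/// Simplified picker: take the preferred mode, otherwise the first one.
/// An empty list (EDID not read yet) yields `None`.
fn pick_mode(modes: &[Mode]) -> Option<&Mode> {
    modes.iter().find(|m| m.preferred).or_else(|| modes.first())
}

/// Simplified activation path: without a mode there is nothing to set,
/// so the connector is skipped (the "no mode" case).
fn connector_connected(modes: &[Mode]) -> Activation {
    match pick_mode(modes) {
        Some(mode) => Activation::Activated(mode.clone()),
        None => Activation::SkippedNoMode,
    }
}
```

With an empty slice the sketch returns `Activation::SkippedNoMode`, which corresponds to step 3 above.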

Smithay's ConnectorScanner treats (Connected, Connected) as a no-op — it does not re-emit events for already-connected connectors. This means the output gets stuck in a permanent "connected but never activated" dead state.

Whether activation succeeds depends entirely on timing: if a second UdevEvent::Changed fires after EDID completes, the connector recovers. This makes the bug intermittent — monitors sometimes come up and sometimes don't on the same hardware.
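
To illustrate why an already-connected connector produces no new event, here is a minimal model of a state-diffing scanner. The `ConnState` and `ScanEvent` enums and the `diff()` function are hypothetical names for this sketch; only the no-op `(Connected, Connected)` arm reflects the behavior described above.

```rust
/// Hypothetical connector state as seen by one scan.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConnState {
    Connected,
    Disconnected,
}

/// Hypothetical events produced by comparing two scans.
#[derive(Debug, PartialEq)]
enum ScanEvent {
    Connected,
    Disconnected,
}

/// Only state transitions produce an event.
fn diff(old: ConnState, new: ConnState) -> Option<ScanEvent> {
    match (old, new) {
        (ConnState::Disconnected, ConnState::Connected) => Some(ScanEvent::Connected),
        (ConnState::Connected, ConnState::Disconnected) => Some(ScanEvent::Disconnected),
        // Same state on both scans: nothing is emitted, even if the
        // connector's mode list was filled in between the two scans.
        (ConnState::Connected, ConnState::Connected) => None,
        (ConnState::Disconnected, ConnState::Disconnected) => None,
    }
}
```

In this model, a connector that was first seen as Connected with no modes never yields a second `ScanEvent::Connected`, so nothing re-triggers activation.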

Root Cause

The EDID race is in the kernel/dock firmware timing, but niri has no retry path for connectors that were connected but could not be activated due to missing modes.

Fix

Add a bounded retry mechanism in Tty:

  • After device_changed() processes connectors, schedule_rescan_if_needed() checks for connected connectors that have no matching surface (not yet activated) and are not non-desktop connectors
  • If found, schedules a calloop::Timer (2 s delay) that re-invokes device_changed(), giving the kernel time to complete EDID reads
  • Retries are capped at MAX_RESCAN_RETRIES (3) to prevent infinite rescheduling
  • The timer self-clears when all connectors are successfully activated
  • Existing timers are cancelled before scheduling new ones (cancel-and-reschedule)
  • Timers are cleaned up on device removal
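
The check that decides whether a rescan is needed can be sketched roughly like this. `ConnectorInfo` and `needs_rescan()` are hypothetical names, and the set of connector IDs stands in for niri's surface bookkeeping; the real code works on DRM connector handles.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of one connector after a scan.
struct ConnectorInfo {
    id: u32,
    connected: bool,
    non_desktop: bool,
}

/// Returns true if any connected, desktop-capable connector has no
/// surface yet, i.e. it was reported as connected but never activated.
fn needs_rescan(connectors: &[ConnectorInfo], active_surfaces: &HashSet<u32>) -> bool {
    connectors
        .iter()
        .any(|c| c.connected && !c.non_desktop && !active_surfaces.contains(&c.id))
}
```

Non-desktop connectors (e.g. VR headsets) are excluded because they are never expected to get a surface, so they should not keep the retry alive.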

New fields on OutputDevice

  • rescan_timer_token: Option<RegistrationToken> — handle to cancel pending timers
  • rescan_retry_count: u8 — bounded counter, reset on successful activation
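
A rough model of how these two fields interact is shown below. It does not use calloop: `TimerToken` is a hypothetical stand-in for `RegistrationToken`, `next_token` exists only to mint fake tokens, and the exact reset behavior is an assumption based on the description above.

```rust
const MAX_RESCAN_RETRIES: u8 = 3;

/// Hypothetical stand-in for calloop's RegistrationToken.
type TimerToken = u32;

#[derive(Default)]
struct RescanState {
    rescan_timer_token: Option<TimerToken>,
    rescan_retry_count: u8,
    /// Only used to mint fake tokens in this sketch.
    next_token: TimerToken,
}

impl RescanState {
    /// Called after each scan. Returns the token of a newly scheduled
    /// timer, or `None` if no timer was scheduled.
    fn schedule_rescan_if_needed(&mut self, needs_rescan: bool) -> Option<TimerToken> {
        // Cancel-and-reschedule: forget any pending timer first.
        // A real implementation would also remove it from the event loop.
        self.rescan_timer_token = None;

        if !needs_rescan {
            // Everything is activated: reset the counter.
            self.rescan_retry_count = 0;
            return None;
        }
        if self.rescan_retry_count >= MAX_RESCAN_RETRIES {
            // Retry budget exhausted: give up instead of rescheduling forever.
            return None;
        }

        self.rescan_retry_count += 1;
        self.next_token += 1;
        self.rescan_timer_token = Some(self.next_token);
        self.rescan_timer_token
    }
}
```

Calling `schedule_rescan_if_needed(true)` four times in a row schedules three timers and then stops; a later call with `false` resets the counter so a future hotplug gets a fresh budget.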

Testing

  • Hardware: ThinkPad P16 Gen 3, NVIDIA RTX PRO 4000 (nvidia-drm), Intel iGPU (i915), Lenovo USB-C Dock Gen 2, two LEN S27q-10 monitors (HDMI + DP MST)
  • Builds cleanly (cargo check, cargo clippy, cargo build --release)
  • Follows existing calloop timer patterns used elsewhere in tty.rs (e.g. VRR timers, redraw timers)

Alternatives Considered

  • Fixing in smithay's ConnectorScanner: Would require changing the (Connected, Connected) => {} behavior to re-emit events, which could have broader implications for all smithay consumers
  • Polling loop: Rejected in favor of bounded async timer to avoid blocking the event loop
  • Infinite retries: Rejected — capped at 3 to avoid pathological cases

@coleleavitt
Author

Upstream Root Cause & Fix

After deeper analysis, the underlying issue is in smithay's ConnectorScanner in smithay-drm-extras. The (Connected, Connected) arm is a no-op, which means mode-list changes on already-connected connectors are silently ignored. When a USB-C dock connector reports as Connected before EDID is ready (empty mode list), and a later rescan finds modes populated, no Connected event is re-emitted.
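
As an illustration of the idea only (this is not the actual smithay patch), a scanner could compare the mode list of a connector that stays connected and emit an extra event when it changes. `Snapshot`, `Event`, and `diff()` are hypothetical, and comparing mode counts is a deliberate simplification of comparing full mode lists.

```rust
/// Hypothetical per-connector snapshot taken on each scan.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    connected: bool,
    mode_count: usize,
}

/// Hypothetical event set with an extra variant for mode-list changes.
#[derive(Debug, PartialEq)]
enum Event {
    Connected,
    Disconnected,
    Changed,
}

/// Like a plain state diff, but a connector that stays connected and
/// gains or loses modes produces `Event::Changed` instead of nothing.
fn diff(old: &Snapshot, new: &Snapshot) -> Option<Event> {
    match (old.connected, new.connected) {
        (false, true) => Some(Event::Connected),
        (true, false) => Some(Event::Disconnected),
        (true, true) if old.mode_count != new.mode_count => Some(Event::Changed),
        _ => None,
    }
}
```

Under this model, the transition from "connected, 0 modes" to "connected, N modes" becomes visible to the compositor, which can then retry activation.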

I've filed an issue and PR upstream:

Why this PR is still valuable even with the smithay fix

The smithay fix ensures that when a rescan happens and modes have appeared, the compositor gets notified. However, this niri-level rescan timer is still needed because:

  1. No guaranteed second udev event: The kernel may only fire one Changed event before EDID is ready — without a timer-based rescan, there's nothing to trigger a second ConnectorScanner::scan()
  2. Defense in depth: The timer provides a bounded retry window (3 × 2 s = 6 s) regardless of udev event timing
  3. Complementary: The smithay fix makes each rescan more effective (mode changes are detected), while this PR ensures rescans actually happen
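
The 6 s retry window can be worked through with a small model. It assumes the first scan happens at t = 0, each retry fires exactly 2 s after the previous scan, and no other udev event arrives in between; `recovery_time()` and the constant `RESCAN_DELAY_SECS` are hypothetical names for this sketch.

```rust
const RESCAN_DELAY_SECS: u32 = 2;
const MAX_RESCAN_RETRIES: u8 = 3;

/// Returns the time (in seconds after hotplug) of the first timer-driven
/// rescan that would see modes, given when EDID becomes readable.
/// Returns `None` if EDID is not ready within the retry window.
fn recovery_time(edid_ready_at_secs: u32) -> Option<u32> {
    (1..=u32::from(MAX_RESCAN_RETRIES))
        // Rescans fire at 2 s, 4 s and 6 s after the initial scan.
        .map(|attempt| attempt * RESCAN_DELAY_SECS)
        .find(|&t| t >= edid_ready_at_secs)
}
```

If EDID becomes readable 1 s after hotplug, the rescan at 2 s recovers the output; if it takes 7 s, all three retries miss it and only a later udev event could recover the connector.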

…otplug

USB-C docks with DP MST/alt-mode may report connectors as Connected
before EDID data is available, causing pick_mode() to return None and
connector_connected() to skip activation. Smithay's ConnectorScanner
does not re-emit events for already-connected connectors, leaving the
output in a permanent dead state.

Add a bounded retry mechanism: after device_changed() processes
connectors, schedule_rescan_if_needed() checks for connected connectors
that have no matching surface (not yet activated). If found, it schedules
a calloop timer (2 s delay) that re-runs device_changed(), giving the
kernel time to complete EDID reads. The retry is capped at 3 attempts
and self-clears when all connectors are activated or on device removal.

coleleavitt force-pushed the fix/monitor-rescan-edid-race branch from 92ea363 to 67b12ec on February 7, 2026 07:41

@YaLTeR
Member

YaLTeR commented Feb 15, 2026

I suppose this needs updating to use the new Changed event instead?

@coleleavitt
Author

I suppose this needs updating to use the new Changed event instead?

Yes, I'll update it today if I get time. Thanks.
