Hard kernel crash: nvidia-open 580.159.03 on dual RTX 2080 Ti (kexec/kdump triggered on GPU idle transition) #1211

@superniker

Environment

  • OS: Ubuntu 26.04, kernel 7.0.0-22-generic
  • Driver: nvidia-open 580.159.03
  • GPU: 2× RTX 2080 Ti 22GB + NVLink
  • CPU: i5-12400, 48GB RAM

Behavior

The system hard-crashes (kernel panic → kexec/kdump → reboot) when the GPUs transition from heavy load to idle. Reproduced 3 times in 24 hours, always at the same point: right after the vLLM inference server finishes processing and GPU utilization drops from 100% to 0%.
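Since the crash leaves nothing in the kernel log, one way to capture the last GPU state before the reset is to poll `nvidia-smi` and force every sample to disk. The following is a minimal sketch, not something I have run on the affected machine: the field selection, the half-second interval, and the helper names (`parse_samples`, `query_gpus`, `log_forever`) are all assumptions chosen for illustration.

```python
import csv
import io
import os
import subprocess
import time

# Fields passed to `nvidia-smi --query-gpu`. The selection is an assumption
# about what is useful to see around a load-to-idle transition.
FIELDS = ["timestamp", "index", "utilization.gpu", "pstate", "power.draw", "clocks.sm"]


def parse_samples(csv_text):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    return [dict(zip(FIELDS, (cell.strip() for cell in row))) for row in rows if row]


def query_gpus():
    """Run nvidia-smi once and return the parsed per-GPU samples."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_samples(out)


def log_forever(path, interval_s=0.5):
    """Append samples to `path`, syncing each batch to disk before sleeping."""
    with open(path, "a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        while True:
            for sample in query_gpus():
                writer.writerow(sample)
            fh.flush()
            os.fsync(fh.fileno())  # so the last batch survives a hard reset
            time.sleep(interval_s)
```

Running `log_forever("gpu_trace.csv")` alongside the vLLM server (the file name is hypothetical) should leave the final recorded P-state and SM clock of each GPU on disk, which may show how far the idle transition got before the panic.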

No Xid errors, no coredump, and no kernel log entries before the crash. /var/crash/ contains only kexec_cmd and kdump_lock (0 bytes; no dump was generated).
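Because no dump was written, it may be worth ruling out a kdump setup problem separately from the driver bug. A minimal sketch, assuming the standard sysfs entries `/sys/kernel/kexec_crash_loaded` and `/sys/kernel/kexec_crash_size` are present on this kernel; the function name `kdump_readiness` and the returned keys are hypothetical:

```python
from pathlib import Path


def kdump_readiness(sysfs_root="/sys/kernel", cmdline_path="/proc/cmdline"):
    """Report whether a crash kernel is reserved and loaded."""
    root = Path(sysfs_root)
    # "1" means a capture kernel is currently loaded via kexec.
    loaded = (root / "kexec_crash_loaded").read_text().strip() == "1"
    # Bytes of memory reserved for the capture kernel (0 if none).
    reserved_bytes = int((root / "kexec_crash_size").read_text().strip())
    # Any crashkernel= arguments the running kernel was booted with.
    cmdline = Path(cmdline_path).read_text().split()
    crashkernel_args = [arg for arg in cmdline if arg.startswith("crashkernel=")]
    return {
        "crash_kernel_loaded": loaded,
        "reserved_bytes": reserved_bytes,
        "crashkernel_args": crashkernel_args,
    }
```

If `crash_kernel_loaded` is true and memory is reserved, the missing dump more likely points at the capture kernel failing after the panic than at kdump being unconfigured, though this check alone cannot confirm that.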

Related

Similar report with same driver version on RTX 5070: pop-os/cosmic-comp#2341

Workaround

Switching to the closed-source nvidia-driver-580-server package resolves the issue, which suggests the bug is in the open kernel module rather than in the rest of the driver stack.
