Debugging kernel crashes using kdump

Introduction

kdump is a service that creates crash dumps when the kernel crashes. It uses kexec(8) to boot into a secondary kernel (known as a capture kernel), then exports the contents of the crashed kernel’s memory (known as a crash dump or vmcore) to the filesystem. The contents of the vmcore can then be analyzed to determine the root cause of the kernel crash.

Configuring kdump requires setting the crashkernel kernel argument and enabling the kdump systemd service. Memory must be reserved for the crash kernel during booting of the first kernel. crashkernel=auto generally doesn’t reserve enough memory on Fedora CoreOS, so it is recommended to specify crashkernel=300M.

By default, the vmcore will be saved in /var/crash. It is also possible to write the dump to some other location on the local system or to send it over the network by editing /etc/kdump.conf. For additional information, see kdump.conf(5) and the comments in /etc/kdump.conf and /etc/sysconfig/kdump.
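
After a crash, one way to locate the saved dumps is to search the dump directory. The following is a minimal sketch, assuming the default /var/crash location and that dump files have names starting with vmcore; the function name list_vmcores and its optional directory argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list dump files below a crash directory.
# The directory defaults to /var/crash; pass another path if
# /etc/kdump.conf has been changed to save dumps elsewhere.
list_vmcores() {
    dir="${1:-/var/crash}"
    # Dumps are assumed to live in the directory or its subdirectories.
    find "$dir" -type f -name 'vmcore*' 2>/dev/null | sort
}
```

Running `list_vmcores` with no argument searches /var/crash; reading that directory may require root privileges.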

Configuring kdump via Ignition

Example kdump configuration
variant: fcos
version: 1.4.0
kernel_arguments:
  should_exist:
  - 'crashkernel=300M'
systemd:
  units:
  - name: kdump.service
    enabled: true

Configuring kdump after initial provisioning

  1. Set the crashkernel kernel argument.

    sudo rpm-ostree kargs --append='crashkernel=300M'

    For more information, see the documentation on modifying kernel arguments via rpm-ostree.

  2. Enable the kdump systemd service.

    sudo systemctl enable kdump.service

  3. Reboot your system.

    sudo systemctl reboot
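
Once the system is back up, it can be useful to confirm that the kernel argument actually took effect. The sketch below extracts the crashkernel value from a kernel command line string; the function name crashkernel_value is hypothetical and not part of any kdump tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the value of crashkernel= found in a
# kernel command line string. Prints nothing and returns 1 if the
# argument is absent.
crashkernel_value() {
    # Rely on word splitting to walk the space-separated arguments.
    for arg in $1; do
        case "$arg" in
            crashkernel=*)
                printf '%s\n' "${arg#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

For example, `crashkernel_value "$(cat /proc/cmdline)"` should print 300M after following the steps above, and `systemctl is-enabled kdump.service` should report enabled.
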

It is highly recommended to test the configuration after setting up the kdump service, paying particular attention to the amount of memory reserved for the crash kernel. For information on how to test that kdump is properly armed and how to analyze the dump, see the kdump documentation for Fedora and the Linux kernel documentation on kdump.
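
As a rough first check before a full test, the kernel exposes whether a crash kernel is loaded and how much memory is reserved for it through the sysfs files kexec_crash_loaded and kexec_crash_size. The following sketch reads both; the function name kdump_status and its optional directory argument are hypothetical, added so the function can be pointed at a different directory. It is not a substitute for triggering a real test crash as described in the referenced documentation.

```shell
#!/bin/sh
# Hypothetical helper: report whether a crash kernel is loaded and how
# much memory is reserved for it. The directory defaults to /sys/kernel.
kdump_status() {
    dir="${1:-/sys/kernel}"
    # kexec_crash_loaded contains 1 when a crash kernel is loaded.
    loaded=$(cat "$dir/kexec_crash_loaded" 2>/dev/null || echo 0)
    # kexec_crash_size contains the reserved memory size in bytes.
    size=$(cat "$dir/kexec_crash_size" 2>/dev/null || echo 0)
    if [ "$loaded" = "1" ]; then
        echo "armed: $((size / 1024 / 1024)) MiB reserved"
    else
        echo "not armed"
        return 1
    fi
}
```

Calling `kdump_status` with no argument reads the live values from /sys/kernel. A result of "not armed" after following the steps above suggests checking the output of `systemctl status kdump.service`, for example for signs that the reserved memory is too small.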