Add Support for Checking and Alerting on Certificate Expiry

Date: 2024-03-26

Status

Accepted

Context

The certificates generated by K3s have two lifecycles:

  • Certificate authority certificates expire 3650 days (roughly 10 years) from their moment of issuance. The CA certificates are not automatically renewed, and require manual intervention to extend their validity.
  • Leaf certificates (client and server certs) expire 365 days (roughly 1 year) from their moment of issuance. Leaf certificates are automatically renewed if, at the time K3s starts, they are within 90 days of expiring.
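The two lifetimes and the startup renewal window described above can be sketched in Go as follows. This is a minimal illustration and not the K3s implementation: the names `expiresAt`, `renewedAtStartup`, and `exampleIssued` are hypothetical, and the handling of the exact 90-day boundary and of already-expired certificates is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Values taken from the lifecycle description in this ADR.
const (
	caLifetimeDays    = 3650 // CA certificates, never renewed automatically
	leafLifetimeDays  = 365  // client and server certificates
	renewalWindowDays = 90   // leaf certs this close to expiry are renewed at startup
)

const day = 24 * time.Hour

// exampleIssued is an arbitrary issuance time used only for illustration.
var exampleIssued = time.Date(2024, time.March, 26, 0, 0, 0, 0, time.UTC)

// expiresAt returns the expiry time of a certificate issued at `issued`
// with a lifetime of `lifetimeDays` days.
func expiresAt(issued time.Time, lifetimeDays int) time.Time {
	return issued.Add(time.Duration(lifetimeDays) * day)
}

// renewedAtStartup reports whether a leaf certificate expiring at notAfter
// falls inside the renewal window when K3s starts at `start`.
// Assumption: the boundary is inclusive and an expired certificate also counts.
func renewedAtStartup(notAfter, start time.Time) bool {
	return notAfter.Sub(start) <= renewalWindowDays*day
}

func main() {
	leaf := expiresAt(exampleIssued, leafLifetimeDays)
	ca := expiresAt(exampleIssued, caLifetimeDays)
	fmt.Println("leaf expires:", leaf.Format("2006-01-02"))
	fmt.Println("CA expires:  ", ca.Format("2006-01-02"))

	// Simulate K3s starting at different points after issuance.
	for _, daysAfterIssue := range []int{100, 300, 400} {
		start := exampleIssued.Add(time.Duration(daysAfterIssue) * day)
		fmt.Printf("start on day %d: leaf renewed = %v\n",
			daysAfterIssue, renewedAtStartup(leaf, start))
	}
}
```

The sketch makes the operational gap concrete: a node that starts on day 100 and then runs without a restart keeps its leaf certificates until they expire, because the renewal check only happens at startup.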

K3s does not currently expose any information about certificate validity. There are no metrics, CLI tools, or events that an administrator can use to track when certificates must be renewed or rotated to avoid outages when certificates expire. The best we can do at the moment is recommend that administrators either restart their nodes regularly to ensure that certificates are renewed within the 90 day window, or manually rotate their certs yearly.

We do not have any guidance around renewing the CA certs, which will be a major undertaking for users as their clusters approach the 10-year mark. We currently have a bit of runway on this issue, as K3s has not been around for 10 years.

Decision

  • K3s will add a CLI command to print certificate validity. It will be grouped alongside the command used to rotate the leaf certificates (k3s certificate rotate).
  • K3s will add an internal controller that maintains metrics for certificate expiration, and creates Events when certificates are about to expire or have expired.
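Both the CLI command and the controller need to decide whether a certificate is healthy, about to expire, or already expired. A minimal sketch of that classification is shown below. It is not the K3s implementation: the `certStatus` type, the status names, the `classify` function, the 90-day `warnWindow`, and the certificate names in `main` are all assumptions made for illustration, since this ADR does not define a warning threshold or output format.

```go
package main

import (
	"fmt"
	"time"
)

// certStatus is a hypothetical three-way classification, not a K3s type.
type certStatus string

const (
	statusOK       certStatus = "OK"
	statusExpiring certStatus = "ExpiringSoon"
	statusExpired  certStatus = "Expired"
)

// warnWindow is an assumed threshold for "about to expire".
const warnWindow = 90 * 24 * time.Hour

// exampleNow is fixed so that the example output is deterministic.
var exampleNow = time.Date(2024, time.March, 26, 0, 0, 0, 0, time.UTC)

// classify maps a certificate's NotAfter time to a status at time `now`.
// A certificate is treated as expired only once `now` is past notAfter.
func classify(notAfter, now time.Time) certStatus {
	switch {
	case now.After(notAfter):
		return statusExpired
	case notAfter.Sub(now) <= warnWindow:
		return statusExpiring
	default:
		return statusOK
	}
}

func main() {
	// Illustrative certificates, described by days until expiry.
	certs := []struct {
		name          string
		daysRemaining int
	}{
		{"example-ca", 3000},
		{"example-server", 30},
		{"example-client", -1},
	}
	for _, c := range certs {
		notAfter := exampleNow.AddDate(0, 0, c.daysRemaining)
		fmt.Printf("%-15s %s  %s\n",
			c.name, notAfter.Format("2006-01-02"), classify(notAfter, exampleNow))
	}
}
```

Keeping the classification in one pure function would let the CLI output, the metrics, and the Events agree on what "about to expire" means.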

Consequences

This decision will require additional documentation, new CLI subcommands, and QA work to validate the certificate check and rotation steps.