# Support CA Certificate Renewal / Rotation, Signing by External Root

Date: 2022-12-19

## Status

Accepted

## Context

On the first startup of a new cluster, K3s currently autogenerates a number of self-signed cluster CAs and keys:
* Cluster Server CA + Key (used to sign server certificates)
* Cluster Client CA + Key (used to sign client certificates)
* Request Header CA + Key (used to sign certificates for apiserver aggregation)
* etcd Peer CA + Key (used to sign certificates for authentication between etcd peer servers)
* etcd Client CA + Key (used to sign certificates for etcd clients, i.e. the apiserver)
* ServiceAccount Token Signing Key (used to sign ServiceAccount JSON Web Tokens)

These CAs are all self-signed, without any cross-signing, common root, or intermediates, and are valid for 10
years. When any of these CA certificates expires, every certificate it issued becomes invalid, causing a
significant outage to the cluster.

### Server CA Pinning

The Cluster Server CA is used in node bootstrapping. The full `K10` format token includes a SHA256 sum of the
Cluster Server CA file's on-disk PEM representation. Nodes that join the cluster using a full token perform a
set of checks when starting up:
1. Download the cluster server CA bundle from `/v1-k3s/cacert` on the server they are joining.
2. Validate that the hash of the bytes in the CA bundle matches the hash string following the `K10` prefix in the
token.
3. Validate that the certificate presented by the server they are joining can be validated using the roots and
intermediates present in the CA bundle.

Realistically, this hash should have instead been derived from the DER encoding of the root certificate in
that bundle, as PEM format allows for variable padding, line lengths, and so on. Only DER format is guaranteed
to be stable, and hashing only the root of the chain would have allowed for rotating or renewing intermediate
CAs without breaking trust between cluster nodes.

### Bootstrap Data Immutability

There is currently no way to write new certificates to the datastore. The certificates and keys are
written to disk once on initial startup, and from there written to the cluster datastore. From that point on,
the files in the datastore are considered authoritative; replacing the files on disk results in either
replacement or an error, depending on whether the files on disk are newer than those in the datastore.

The `secrets-encrypt` subcommand does currently mutate the bootstrap data, but it only touches the secrets
encryption configuration, not the CA certs or keys.

### Summary

For both of the above reasons (hash pinning and lack of rewritability), it is not currently possible to
renew or replace the cluster CA certs or keys.

### Additional Considerations

#### Aggressive Certificate Rotation

Some users (particularly government or financial customers) attempt to implement the guidance from
[NIST SP 800-57 Part 1 Rev. 5](https://csrc.nist.gov/publications/detail/sp/800-57-part-1/rev-5/final). Following
this guidance, users would sign cluster CAs with a set of organizational root and intermediate certificates, and
rotate both the intermediate and cluster CA certificates and keys on at least a yearly basis.

#### ServiceAccount Signing Key Rolling Replacement

While the ServiceAccount signing key is not signed by any CA, rotation of the key must be done carefully so
as to avoid causing an outage. The apiserver and controller-manager must be updated to use a new key, while
still trusting the old key for a period of time. The old key can then be removed at a later date, once all
clients using tokens signed by the old key have received new tokens.

## Decision

* K3s will allow for use of CA certificates signed by an arbitrary set of external root/intermediate CAs.
* K3s will allow for non-disruptive[^1] renewal or replacement of the CA certificates and keys, if the cluster was
originally started using user-provided certificates signed by an external CA.
* K3s will allow for disruptive[^2] renewal or replacement of cluster CA certificates and keys, if the cluster was
originally started with autogenerated self-signed CAs.
* K3s will provide example tooling to allow users to generate cluster CA certificates and keys prior to initial
cluster startup, and provide tooling and process documentation to update the bootstrap data and prepare agents
to trust the new certificates (if necessary).

[^1]: Non-disruptive renewal requires no change to node configuration. The service only needs to be restarted.
[^2]: Disruptive renewal requires changes to the K3s CLI flags, configuration file, or environment variables
prior to restarting the service. Additionally, the cluster may experience a temporary outage while the
configuration change is being applied to all nodes, due to cluster nodes temporarily not sharing a common
root of trust.

## Consequences

This will require additional documentation, CLI subcommands, and QA work to validate the process steps.