Allows nodes to join the cluster during a webhook outage. This also
enhances auditability by creating Kubernetes events for the deferred
verification.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move coverage writer into agent and server
* Add coverage report to E2E PR tests
* Add codecov upload to drone
Signed-off-by: Derek Nola <derek.nola@suse.com>
Only actual admin actions should use the admin kubeconfig; everything done by the supervisor/deploy/helm controllers will now use a distinct account for audit purposes.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
As per https://github.com/golang/go/issues/47001 even subtle.ConstantTimeCompare should never be used with variable-length inputs, as it will return 0 if the lengths do not match. Switch to consistently using constant-time comparisons of hashes for password checks to avoid any possible side-channel leaks that could be combined with other vectors to discover password lengths.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Several places in the code used a 5-second retry loop to wait on
Runtime.Core to be set. This caused a race condition where OnChange
handlers could be added after the Wrangler shared informers were already
started. When this happened, the handlers were never called because the
shared informers they relied upon were not started.
Fix that by requiring anything that waits on Runtime.Core to run from a
cluster controller startup hook that is guaranteed to be called before
the shared informers are started, instead of just firing it off in a
goroutine that retries until it is set.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes an issue where CRDs were being created without schema, allowing
resources with invalid content to be created, later stalling the
controller ListWatch event channel when the invalid resources could not
be deserialized.
This also requires moving Addon GVK tracking from a status field to
an annotation, as the GroupVersionKind type has special handling
internal to Kubernetes that prevents it from being serialized to the CRD
when schema validation is enabled.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
We need to send the full chain in order for cross-signing to work
properly during switchover to a new root.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Turns out etcd-only nodes were never running **any** of the controllers,
so allowing multiple controllers didn't really fix things.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This command must be run on a server while the service is running. After this command completes, all the servers in the cluster should be restarted to load the new CA files.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Having separate tokens for server and agent nodes is a nice feature.
However, passing server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without CA hash is insecure when CA is
self-signed, and k3s warns about it in the logs:
```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```
Okay so I need CA hash but where should I get it?
This commit attempts to fix this issue by saving agent token value to
`agent-token` file with CA hash appended.
Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.
Signed-off-by: Darren Shepherd <darren@acorn.io>
This gives nicer errors from Kubernetes components during startup, and
reduces LOC a bit by using the upstream responsewriters module instead
of writing the headers and body by hand.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.
Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
This allows secondary etcd nodes to bootstrap the kubelet before an
apiserver joins the cluster. Rancher waits for all the etcd nodes to
come up before adding the control-plane nodes, so this needs to be
handled properly.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder
Signed-off-by: Derek Nola <derek.nola@suse.com>
Required to support apiextensions.v1 as v1beta1 has been deleted. Also
update helm-controller and dynamiclistener to track wrangler versions.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>