Commit Graph

134 Commits

Author SHA1 Message Date
Vladimir Kochnev
13af0b1d88 Save agent token to /var/lib/rancher/k3s/server/agent-token
Having separate tokens for server and agent nodes is a nice feature.

However, passing server's plain `K3S_AGENT_TOKEN` value
to `k3s agent --token` without CA hash is insecure when CA is
self-signed, and k3s warns about it in the logs:

```
Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash.
Use the full token from the server's node-token file to enable Cluster CA validation.
```

Okay so I need CA hash but where should I get it?

This commit attempts to fix this issue by saving agent token value to
`agent-token` file with CA hash appended.

Signed-off-by: Vladimir Kochnev <hashtable@yandex.ru>
2022-08-01 14:11:50 -07:00
Brad Davidson
5eaa0a9422 Replace getLocalhostIP with Loopback helper method
Requires tweaking existing method signature to allow specifying whether or not IPv6 addresses should be return URL-safe.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-07-21 16:51:57 -07:00
Derek Nola
a9b5a1933f
Delay service readiness until after startuphooks have finished (#5649)
* Move startup hooks wg into a runtime pointer, check before notifying systemd
* Switch default systemd notification to server
* Add 1 sec delay to allow etcd to write to disk
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-06-15 09:00:52 -07:00
Darren Shepherd
e6009b1edf Introduce servicelb-namespace parameter
This parameter controls which namespace the klipper-lb pods will be create.
It defaults to kube-system so that k3s does not by default create a new
namespace. It can be changed if users wish to isolate the pods and apply
some policy to them.

Signed-off-by: Darren Shepherd <darren@acorn.io>
2022-06-14 15:48:58 -07:00
Brad Davidson
ce5b9347c9 Replace DefaultProxyDialerFn dialer injection with EgressSelector support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 17:54:36 -07:00
Brad Davidson
3d01ca1309 Make supervisor errors parsable by Kubernetes client libs
This gives nicer errors from Kubernetes components during startup, and
reduces LOC a bit by using the upstream responsewriters module instead
of writing the headers and body by hand.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-29 09:23:37 -07:00
Brad Davidson
b12cd62935 Move IPv4/v6 selection into helpers
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson
5b2c14b123 Print a helpful error when trying to join additional servers but etcd is not in use
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson
99851b0f84 Use core constants for cert user/group values
Also update cert gen to ensure leaf certs are regenerated if other key fields change.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-15 01:02:42 -07:00
Brad Davidson
f37e7565b8 Move the apiserver addresses controller into the etcd package
This controller only needs to run when using managed etcd, so move it in
with the rest of the etcd stuff. This change also modifies the
controller to only watch the Kubernetes service endpoint, instead of
watching all endpoints in the entire cluster.

Fixes an error message revealed by use of a newer grpc client in
Kubernetes 1.24, which logs an error when the Put to etcd failed because
kine doesn't support the etcd Put operation. The controller shouldn't
have been running without etcd in the first place.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-07 11:28:15 -07:00
Brad Davidson
49544e0d49 Allow agents to query non-apiserver supervisors for apiserver endpoints
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-04-06 13:03:14 -07:00
Roberto Bonafiglia
4afeb9c5c7
Merge pull request #5325 from rbrtbnfgl/fix-etcd-ipv6-url
Fixed etcd URL in case of IPv6 address
2022-04-05 09:55:42 +02:00
Roberto Bonafiglia
dda409b041 Updated localhost address on IPv6 only setup
Signed-off-by: Roberto Bonafiglia <roberto.bonafiglia@suse.com>
2022-03-29 09:35:54 +02:00
Brad Davidson
e811689df9 Fix etcd-only secrets encryption rotation
Improve feedback when running secrets-encrypt commands on etcd-only nodes, and
allow etcd-only nodes to properly restart when effecting rotation.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-25 10:40:58 -07:00
Brad Davidson
38706eeec0 Defer ensuring node passwords on etcd-only nodes during initial cluster bootstrap
This allows secondary etcd nodes to bootstrap the kubelet before an
apiserver joins the cluster. Rancher waits for all the etcd nodes to
come up before adding the control-plane nodes, so this needs to be
handled properly.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-18 10:58:37 -07:00
Brad Davidson
a93b9b6d53 Update helm-controller
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-03-16 23:49:14 -07:00
Luther Monson
9a849b1bb7
[master] changing package to k3s-io (#4846)
* changing package to k3s-io

Signed-off-by: Luther Monson <luther.monson@gmail.com>

Co-authored-by: Derek Nola <derek.nola@suse.com>
2022-03-02 15:47:27 -08:00
Brad Davidson
5014c9e0e8 Fix adding etcd-only node to existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 19:56:08 -08:00
Brad Davidson
a1b800f0bf Remove unnecessary copies of etcdconfig struct
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-02-28 12:05:16 -08:00
Brad Davidson
e7464a17f7 Fix use of agent creds for secrets-encrypt and config validate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-06 12:55:18 -08:00
Derek Nola
bcb662926d
Secrets-encryption rotation (#4372)
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-07 14:31:32 -08:00
Brad Davidson
3da1bb3af2 Fix other uses of NewForConfigOrDie in contexts where we could return err
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00
Brad Davidson
5a923ab8dc Add containerd ready channel to delay etcd node join
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-14 14:03:52 -07:00
Chris Kim
928b8531c3
[master] Add etcd-member-management controller to K3s (#4001)
* Initial leader elected etcd member management controller
* Bump etcd to v3.5.0-k3s2

Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-14 08:20:38 -07:00
Brad Davidson
29c8b238e5 Replace klog with non-exiting fork
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 09:36:16 -07:00
Brad Davidson
e95b75409a Fix lint failures
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson
dc14f370c4 Update wrangler to v0.8.5
Required to support apiextensions.v1 as v1beta1 has been deleted. Also
update helm-controller and dynamiclistener to track wrangler versions.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson
c434db7cc6 Wrap errors in runControllers for additional context
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-08-20 18:47:16 -07:00
Brad Davidson
869b98bc4c Sync DisableKubeProxy into control struct
Sync DisableKubeProxy from cfg into control before sending control to clients,
as it may have been modified by a startup hook.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-07-30 12:26:50 -07:00
Brad Davidson
90445bd581
Wait until server is ready before configuring kube-proxy (#3716)
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-07-27 14:56:05 -07:00
Luther Monson
37fcb61f5e move go routines for api server ready beneath wait group
Signed-off-by: Luther Monson <luther.monson@gmail.com>
2021-07-20 17:36:34 -07:00
Luther Monson
18bc98f60c
adding startup hooks args to access to Disables and Skips (#3674)
Signed-off-by: Luther Monson <luther.monson@gmail.com>
2021-07-20 05:24:52 +02:00
Jamie Phillips
aef8a6aafd
Adding support for waitgroup to the Startuphooks (#3654)
The startup hooks where executing after the deploy controller. We needed the deploy controller to wait until the startup hooks had completed.
2021-07-15 19:28:47 -07:00
Joe Kralicky
a84c75af62 Adds a command-line flag '--disable-helm-controller' that will disable
the server's built-in helm controller.

Problem:
Testing installation and uninstallation of the Helm Controller on k3s is
not possible if the Helm Controller is baked into the k3s server.

Solution:
The Helm Controller can optionally be disabled, which will allow users
to manage its installation manually.

Signed-off-by: Joe Kralicky <joe.kralicky@suse.com>
2021-06-25 14:54:36 -04:00
Hussein Galal
136dddca11
Fix storing bootstrap data with empty token string (#3422)
* Fix storing bootstrap data with empty token string

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* delete node password secret after restoration

fixes to bootstrap key

vendor update

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix comment

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix typo

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* typos

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Removing dynamic listener file after restoration

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* go mod tidy

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-06-22 22:42:34 +02:00
Brad Davidson
a7d1159ba6 Emit events for AddOn lifecycle
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-06-11 14:00:27 -07:00
Brad Davidson
6e48ca9b53 Tidy up function calls with many args
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-06-11 14:00:27 -07:00
Brad Davidson
6ef000091a Add nodename to UA string for deploy controller
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-06-10 17:05:52 -07:00
Brad Davidson
02a5bee62f
Add system-default-registry support and remove shared code (#3285)
* Move registries.yaml handling out to rancher/wharfie
* Add system-default-registry support
* Add CLI support for kubelet image credential providers

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-05-10 15:58:41 -07:00
Brian Downs
c5ad71ce0b
Collect and Store etcd Snapshots and Metadata (#3239)
* Add the ability to store local etcd snapshots and etcd snapshots stored in an S3 compatible object store in a ConfigMap.
2021-04-30 18:26:39 -07:00
Brad Davidson
2705431d96
Add support for dual-stack Pod/Service CIDRs and node IP addresses (#3212)
* Add support for dual-stack cluster/service CIDRs and node addresses

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-04-21 15:56:20 -07:00
Chris Kim
69f96d6225
Define a Controllers and LeaderControllers on the server config (#3043)
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-03-11 10:39:00 -08:00
Brad Davidson
7cdfaad6ce
Always use static ports for client load-balancers (#3026)
* Always use static ports for the load-balancers

This fixes an issue where RKE2 kube-proxy daemonset pods were failing to
communicate with the apiserver when RKE2 was restarted because the
load-balancer used a different port every time it started up.

This also changes the apiserver load-balancer port to be 1 below the
supervisor port instead of 1 above it. This makes the apiserver port
consistent at 6443 across servers and agents on RKE2.

Additional fixes below were required to successfully test and use this change
on etcd-only nodes.

* Actually add lb-server-port flag to CLI
* Fix nil pointer when starting server with --disable-etcd but no --server
* Don't try to use full URI as initial load-balancer endpoint
* Fix etcd load-balancer pool updates
* Update dynamiclistener to fix cert updates on etcd-only nodes
* Handle recursive initial server URL in load balancer
* Don't run the deploy controller on etcd-only nodes
2021-03-06 02:29:57 -08:00
Erik Wilson
4e5218b62c
Apply suggestions from code review
Logging cleanup

Co-authored-by: Brad Davidson <brad@oatmail.org>
2021-03-01 10:44:24 -07:00
Erik Wilson
54a35505f0
Remove Traefik v1 migration 2021-03-01 10:44:24 -07:00
Chin-Ya Huang
cc96f8140a
Allow download traefik static file and rename
Allow writing static files regardless of the version.

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2021-03-01 10:44:24 -07:00
Chin-Ya Huang
10e0328977
Traefik v2 integration
K3s upgrade via watch over file change of static file and manifest
and triggers helm-controller for change. It seems reasonable to
only allow upgrade traefik v1->v2 when there is no existing custom
traefik HelmChartConfig in the cluster to avoid any
incompatibility.

Here also separate the CRDs and put them into a different chart
to support CRD upgrade.

Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
2021-03-01 10:44:23 -07:00
Hussein Galal
5749f66aa3
Add disable flags for control components (#2900)
* Add disable flags to control components

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes to disable flags

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add comments to functions

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Fix joining problem

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* golint

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix ticker

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix role labels

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-02-12 17:35:57 +02:00
Brad Davidson
6e768c301e Use appropriate response codes for authn/authz failures
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-02-09 16:28:20 -08:00
Brian Downs
13229019f8
Add ability to perform an etcd on-demand snapshot via cli (#2819)
* add ability to perform an etcd on-demand snapshot via cli
2021-01-21 14:09:15 -07:00