Commit Graph

897 Commits

Author SHA1 Message Date
Derek Nola
48ffed3852
Enable logging on all subcommands (#4921)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-01-12 14:00:40 -08:00
Brad Davidson
a0cadcd343 Move ClusterResetRestore handling ControlConfig setup
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-12 10:46:10 -08:00
Brad Davidson
5ca206ad3b Fix handling of agent-token fallback to token
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-07 09:56:37 -08:00
Brad Davidson
e7464a17f7 Fix use of agent creds for secrets-encrypt and config validate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2022-01-06 12:55:18 -08:00
Lordran
31f1a00b6f
Fix a typo: advertise-up -> advertise-ip (#4827)
Signed-off-by: 胥朝阳 <xuzhaoyang@91cyt.com>
2022-01-06 08:52:07 -08:00
Derek Nola
2ac8df3602
Integration tests utilities improvements (#4832)
* Remove sudo commands from integration tests

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Added cleanup fucntion

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Implement better int cleanup

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Rename test utils

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Enable K3sCmd to be a single string

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Removed parsePod function

Signed-off-by: Derek Nola <derek.nola@suse.com>

* codespell

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Revert startup timeout

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Reorder sonobuoy tests, drop concurrent tests to 3

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Disable etcd

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Skip parallel testing for etcd

Signed-off-by: Derek Nola <derek.nola@suse.com>
2022-01-06 08:05:56 -08:00
Luther Monson
66eeabbdfc linter doesn't actually run on windows, found these while getting it running on a windows machine
Signed-off-by: Luther Monson <luther.monson@gmail.com>
2021-12-28 20:44:21 -07:00
Derek Nola
ff49dcf71e Export default parser
Signed-off-by: Derek Nola <derek.nola@suse.com>
(cherry picked from commit 9cc930e4a3)
2021-12-22 16:06:55 -08:00
Brad Davidson
87395e32d6 Update modules for Kubernetes v1.23
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-22 10:47:38 -08:00
Manuel Buil
30c701f5de
Merge pull request #4796 from manuelbuil/flannel-logrus
Move flannel logs to logrus
2021-12-22 10:33:43 +01:00
Brad Davidson
a5c6e6a68a Fix panic checking name of uninitialized etcd member
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-21 23:38:20 -08:00
Luther Monson
02f862da5f
Merge pull request #4791 from luthermonson/vendor-rm
[master] Remove the Vendor Directory
2021-12-21 15:07:55 -07:00
Brian Downs
3ae550ae51
Update bootstrap logic to output all changed files on disk (#4800) 2021-12-21 14:28:32 -07:00
Luther Monson
e6cf8f5982 code changes to drop the vendor dir
Signed-off-by: Luther Monson <luther.monson@gmail.com>
2021-12-21 14:23:38 -07:00
Manuel Buil
4eb282edac Move flannel logs to logrus
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-12-21 14:34:51 +01:00
Hussein Galal
2e91913f54
Close agentReady channel only in k3s (#4792)
* Close agentReady channel only in k3s

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* codespell check

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-12-21 00:22:49 +02:00
Brad Davidson
8ad7d141e8 Close etcd clients to avoid leaking GRPC connections
If you don't explicitly close the etcd client when you're done with it,
the GRPC connection hangs around in the background. Normally this is
harmelss, but in the case of the temporary etcd we start up on 2399 to
reconcile bootstrap data, the client will start logging errors
afterwards when the server goes away.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-17 23:55:17 -08:00
Manuel Buil
588d15db8f Remove Disables, Skips and DisableKubeProxy from the comparing configs
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-12-17 19:04:38 +01:00
Brad Davidson
6f4217a340 Build standalone containerd
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-16 12:00:15 -08:00
Derek Nola
17eebe0563
Fix cold boot and reconcilation on secondary servers (#4747)
* Enable reconcilation on secondary servers

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Remove unused code

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Attempt to reconcile with datastore first

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Added warning on failure

Signed-off-by: Derek Nola <derek.nola@suse.com>

* Update warning

Signed-off-by: Derek Nola <derek.nola@suse.com>

* golangci-lint fix

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-15 15:38:50 -08:00
Hussein Galal
d71b335871
Fix snapshot restoration on fresh nodes (#4737)
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-12-14 02:04:39 +02:00
Brian Downs
bf4e037fcf
Resolve Bootstrap Migration Edge Case (#4730) 2021-12-13 13:02:30 -07:00
Brian Downs
a6fe2c0bc5
Resolve restore bootstrap (#4704) 2021-12-09 14:54:27 -07:00
Brad Davidson
a70487d5ae Update wharfie usage in windows code path
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-09 13:16:22 -08:00
Hussein Galal
3985fd0e26
[master] Add validation to certificate rotation (#4692)
* Add validation to certificate rotation

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* Add validation to certificate rotation

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-12-09 18:57:13 +02:00
Manuel Buil
1e0696628e
Merge pull request #4581 from manuelbuil/checking-HA-parameters
Verify new control plane nodes joining the cluster share the same config as cluster members
2021-12-08 10:49:28 +01:00
Alexey Medvedchikov
8f389ab030
Include node-external-ip in serving-kubelet.crt SANs (#4620)
* Include node-external-ip in serving-kubelet.crt SANs

Signed-off-by: Alexey Medvedchikov <alexeymedvedchikov@improbable.io>
2021-12-07 15:42:40 -08:00
Derek Nola
bcb662926d
Secrets-encryption rotation (#4372)
* Regular CLI framework for encrypt commands
* New secrets-encryption feature
* New integration test
* fixes for flaky integration test CI
* Fix to bootstrap on restart of existing nodes
* Consolidate event recorder

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-12-07 14:31:32 -08:00
Manuel Buil
1b3187ea07 Check HA network parameters
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-12-07 23:09:05 +01:00
Brad Davidson
7d3447ceff Bump wharfie to v0.5.1 and use shared decompression code
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-12-07 12:50:57 -08:00
Hussein Galal
77fd3e99ec
Add cert rotation command (#4495)
* Add cert rotation command

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* add function to check for dynamic listener file

Signed-off-by: Brian Downs <brian.downs@gmail.com>

* Add dynamiclistener cert rotation support

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes to the cert rotation

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix ci tests

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fixes to certificate rotation command

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* more fixes

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

Co-authored-by: Brian Downs <brian.downs@gmail.com>
2021-12-02 23:19:16 +02:00
Manuel Buil
8141a933b0
Merge pull request #4550 from manuelbuil/improve_flannel_logging
Improve flannel code and logging
2021-12-01 18:22:23 +01:00
Derek Nola
d05c334a78
Improved cleanup for etcd unit test (#4537)
* Improved cleanup for etcd unit test

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-29 14:46:58 -08:00
Chris Kim
ae4a1a144a
etcd snapshot functionality enhancements (#4453)
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-11-29 10:30:04 -08:00
Brad Davidson
0c1f816f24 go generate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-11-23 16:38:55 -08:00
Manuel Buil
7685da3e24 Improve flannel logging
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-11-22 21:51:52 +01:00
Hussein Galal
03485632ea
Fix regression with cluster reset (#4521)
* Fix regression with cluster reset

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* typo

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-11-17 23:22:18 +02:00
Derek Nola
ef263bd2b0
Improved regex for double equals arguments (#4505)
Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-16 11:16:13 -08:00
Derek Nola
535a919635
Removed value from warning about skipping flags (#4491)
* Enabled skipping of unkown flags from config in parser
* Added new unit test, expanded existing
* Add warning back in, without value

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-15 13:17:10 -07:00
Chris Kim
f18b3252c0
[master] Add etcd extra args support for K3s (#4463)
* Add etcd extra args support for K3s

Signed-off-by: Chris Kim <oats87g@gmail.com>

* Add etcd custom argument integration test

Signed-off-by: Chris Kim <oats87g@gmail.com>

* go generate

Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-11-11 21:03:15 -08:00
Thorsten Klein
41ff19de71 Feature: Add CoreDNS Customization Options
Problem:
Before, to customize CoreDNS, one had to edit the default configmap,
which gets re-written on every K3s server restart.

Solution:
Mount an additional coredns-custom configmap into the CoreDNS container
and import overrides and additional server blocks from the included
files.

Signed-off-by: Thorsten Klein <iwilltry42@gmail.com>
2021-11-11 18:41:22 -08:00
Derek Nola
4b57951fb0
Fix to allow etcd-snapshot to use config file with flags that are only used with k3s server. (#4464)
* Enabled skipping of unknown flags from config in parser
* Added new unit test, expanded existing

Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-11 16:01:23 -08:00
Brad Davidson
5ab6d21a7d
Increase agent's apiserver ready timeout (#4454)
Since we now start the server's agent sooner and in the background, we
may need to wait longer than 30 seconds for the apiserver to become
ready on downstream projects such as RKE2.

Since this essentially just serves as an analogue for the server's
apiReady channel, there's little danger in setting it to something
relatively high.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-11-11 14:01:49 -07:00
Brad Davidson
bc7cdc78ca go generate
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-11-10 17:36:01 -08:00
Manuel Buil
8271d98a76
Merge pull request #4437 from manuelbuil/fix_svclb_ipv6_rh
Allow svclb pod to enable ipv6 forwarding
2021-11-10 19:08:40 +01:00
Manuel Buil
5d168a1d59 Allow svclb pod to enable ipv6 forwarding
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-11-10 18:20:03 +01:00
Brian Downs
adaeae351c
update bootstrap logic (#4438)
* update bootstrap logic resolving a startup bug and account for etcd
2021-11-10 05:33:42 -07:00
Derek Nola
7bd65047c3
Match to last After keyword for parser (#4383)
* Made parser able to skip over subcommands
* Edge case coverage, reworked regex with groups
Signed-off-by: Derek Nola <derek.nola@suse.com>
2021-11-08 10:54:48 -08:00
Luther Monson
36c6634cce
[master] updating to new signals package in wrangler (#4399)
* updating to new signals package in wrangler

Signed-off-by: Luther Monson <luther.monson@gmail.com>
2021-11-08 08:32:43 -07:00
Brad Davidson
f7dcc139ff Bump klipper-lb image for arm fix
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-11-02 18:55:09 -07:00
Deshi Xiao
f1622129e4 refactor: Use plain channel send or receive
fix issue #4369

should use a simple channel send/receive instead of select with a single
case

Signed-off-by: Deshi Xiao <xiaods@gmail.com>
2021-11-01 15:00:49 -07:00
Brad Davidson
f9f1cabe9c Fix log/reap reexec
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-11-01 14:24:14 -07:00
Jacob Blain Christen
702fe24afe
containerd/cri: enable the btrfs snapshotter (#4316)
* vendor: btrfs
* enable the btrfs snapshotter
* testing: snapshotter/btrfs

Signed-off-by: Jacob Blain Christen <jacob@rancher.com>
2021-10-29 23:31:33 -07:00
Brad Davidson
3da1bb3af2 Fix other uses of NewForConfigOrDie in contexts where we could return err
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00
Brad Davidson
5acd0b9008 Watch the local Node object instead of get/sleep looping
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00
Brad Davidson
3fe460d080 Block scheduler startup on untainted node when using embedded CCM
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-29 15:18:14 -07:00
Derek Nola
7c3f21e581
K3s Integration test fixes (#4341)
* Move tests into sub folders
* Updated documentation
* Prevent infinite loop is user has not made k3s

Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-28 12:35:28 -07:00
galal-hussein
ab3d25a2c5 Update peer address when running cluster-reset
Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-10-25 15:43:27 -07:00
Brian Downs
0a0b915921
reset buffer after use (#4279) 2021-10-22 15:56:01 -07:00
Derek Nola
918945da45
Added configuration input to etcd-snapshot (#4280)
Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-22 12:03:32 -07:00
Brian Downs
e11a4bf8bb
set duration to second (#4231) 2021-10-15 16:46:39 -07:00
Brian Downs
0452f017c1
Add etcd s3 timeout (#4207) 2021-10-15 10:24:14 -07:00
Brian Downs
34080b23b1
Copy old bootstrap buffer data for use during migration (#4215) 2021-10-15 10:17:29 -07:00
Manuel Buil
dbc14b8990 Fix race condition in cloud provider
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-10-15 13:28:32 +02:00
Brad Davidson
5a923ab8dc Add containerd ready channel to delay etcd node join
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-14 14:03:52 -07:00
Hussein Galal
b282528ee2
Display cluster tls error only in debug mode (#4124)
* Display cluster tls error only in debug mode

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-10-13 00:00:28 +02:00
Brad Davidson
dc18ef2e51 Refactor log and reaper exec to omit MAINPID
Using MAINPID breaks systemd's exit detection, as it stops watching the
original pid, but is unable to watch the new pid as it is not a child
of systemd itself. The best we can do is just notify when execing the child
process.

We also need to consolidate forking into a sigle place so that we don't
end up with multiple levels of child processes if both redirecting log
output and reaping child processes.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-12 13:35:10 -07:00
Derek Nola
feec44572d
Improve error message when using a "K10" prefixed token (#4180)
* Add new error message with a K10 prefixed secret token

Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-11 10:00:22 -07:00
Brian Downs
ac7a8d89c6
Add ability to reconcile bootstrap data between datastore and disk (#3398) 2021-10-07 12:47:00 -07:00
Derek Nola
b6919adf62
Add "etcd-" prefix to etcd-snapshot commands as aliases (#4161)
* Add "etcd-" prefix to etcd-snapshot commands as alias

Signed-off-by: dereknola <derek.nola@suse.com>
2021-10-06 14:20:22 -07:00
Manuel Buil
635f790eb4
Merge pull request #4114 from manuelbuil/lb-controller-dual-stack
Dual-stack support in serviceLB controller
2021-10-06 16:08:10 +02:00
Manuel Buil
00cf4578ec Dual-stack support LB controller
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-10-06 11:06:20 +02:00
Marc Bachmann
9b35734e1a Add topologySpreadConstraints to support scaling of coredns
Signed-off-by: Marc Bachmann <marc.brookman@gmail.com>
2021-10-05 11:52:44 -07:00
Brad Davidson
12e675e2cc Don't evacuate the root cgroup when rootless
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-10-01 16:18:12 -07:00
Brad Davidson
5d1a37ee32 Send MAINPID to systemd when reexecing for logfile output
This allows the new process to notify systemd when it is ready.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-29 11:41:09 -07:00
Brad Davidson
a16105b348 Properly handle operation as init process
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-28 11:05:34 -07:00
Brian Downs
f4cea90cb9
set transport to skip verify if se skip flag passed (#4102) 2021-09-28 10:13:50 -07:00
Manuel Buil
87524a7ac7 Enable the inheritance of settings for ipv6
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-09-28 09:42:08 +02:00
Michal Rostecki
47676eff78
Merge pull request #4080 from manuelbuil/update_klipperlb2
Use the new klipper-lb image that has newer go and Alpine versions
2021-09-27 10:11:52 +02:00
Brad Davidson
73e21e739f Drop broken SupportNoneCgroupDriver support
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-23 16:12:51 -07:00
Manuel Buil
b99b943c17 Use the new klipper-lb image that has newer go and Alpine versions
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-09-22 09:23:38 +02:00
Brad Davidson
28be0de4e8 Revert "Use the newer klipper-lb image"
This reverts commit 1d21491094.
2021-09-20 13:19:38 -07:00
Brad Davidson
64b502e92c Disable automounting service account token in servicelb pods
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-17 15:52:44 -07:00
Hussein Galal
7826407a2e
Make sure there are no duplicates in etcd member list (#4025)
* Make sure there are no duplicates in etcd member list

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* fix node names with hyphens

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>

* use full server name for etcd node name

Signed-off-by: galal-hussein <hussein.galal.ahmed.11@gmail.com>
2021-09-18 00:51:18 +02:00
Manuel Buil
1d21491094 Use the newer klipper-lb image
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-09-17 15:42:48 -07:00
Brad Davidson
753e11ee3c Enable JobTrackingWithFinalizers FeatureGate
Works around issue with Job controller not tracking job pods that
are in CrashloopBackoff during upgrade from 1.21 to 1.22.

Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-17 11:26:45 -07:00
Derek Nola
eda65b19d9
Remove expiremental from cluster commands (#4024)
Signed-off-by: dereknola <derek.nola@suse.com>
2021-09-15 16:41:50 -07:00
Joe Kralicky
debb508643
Nvidia container runtime discovery in containerd config template (#3890)
* Update the default containerd config template with support for adding extra container runtimes. Add logic to discover nvidia container runtimes installed via the the gpu operator or package manager.

Signed-off-by: Joe Kralicky <joe.kralicky@suse.com>
2021-09-15 14:31:11 -07:00
Brad Davidson
086ca8ba6a Fix premature etcd shutdown when joining an existing cluster
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-15 10:35:07 -07:00
Manuel Buil
60cd86bc42
Merge pull request #3906 from manuelbuil/dual-stack
Add dual-stack support on flannel
2021-09-15 18:48:10 +02:00
Brad Davidson
85e11c47d1 Add StargzSupported stub for Windows
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-15 09:45:57 -07:00
Chris Kim
acf9036b63
No-op when etcd member was already removed and use existing name for etcd controller (#4014)
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-15 08:41:30 -07:00
Manuel Buil
9fcd79baae Add tests to the dual-stack PR and enable dual-stack with flannel backend
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-09-15 14:11:54 +02:00
Manuel Buil
681058bb40 Add dual-stack support
Signed-off-by: Manuel Buil <mbuil@suse.com>
2021-09-15 11:44:48 +02:00
Brad Davidson
b72306ce3d Return the error since it just gets logged and retried anyways
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-14 16:41:27 -07:00
Brad Davidson
5986898419 Use SubjectAccessReview to validate CCM RBAC
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-14 16:41:27 -07:00
Brad Davidson
dc556cbb72 Set controller authn/authz kubeconfigs
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-14 16:41:27 -07:00
Brad Davidson
199424b608 Pass context into all Executor functions
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-14 16:41:27 -07:00
Chris Kim
928b8531c3
[master] Add etcd-member-management controller to K3s (#4001)
* Initial leader elected etcd member management controller
* Bump etcd to v3.5.0-k3s2

Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-09-14 08:20:38 -07:00
Brad Davidson
57377d2cd4 Minor cleanup on cribbed function
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-09-10 17:04:15 -07:00