As highlighted in the official Kubernetes documentation:
“By default the kubelet serving certificate deployed by kubeadm is self-signed. This means a connection from external services like the metrics-server to a kubelet cannot be secured with TLS.”
Setting up a test cluster with a freshly deployed metrics-server therefore often results in the following error message: “Failed to scrape node, err=Get https://172.18.0.3:10250/metrics/resource: x509: cannot validate certificate for 172.18.0.3 because it doesn’t contain any IP SANs node=kind-worker”. This can be frustrating.
For more information, I recommend checking out the discussion on Issue 196.
In this post, I will demonstrate how to solve this problem in KinD. The solution also applies to any Kubernetes cluster set up with kubeadm.
Reproduce
Set up a Kubernetes cluster without any additional kubeadm init configuration.
cat << EOF | kind create cluster --config -
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.26.2
- role: worker
  image: kindest/node:v1.26.2
- role: worker
  image: kindest/node:v1.26.2
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
EOF
Deploy the metrics-server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml
and errors occur:
E0423 08:43:11.181966 1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.2:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.2 because it doesn't contain any IP SANs" node="kind-worker"
E0423 08:43:11.193158 1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.18.0.3:10250/metrics/resource\": x509: cannot validate certificate for 172.18.0.3 because it doesn't contain any IP SANs" node="kind-worker2"
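To confirm the root cause, you can inspect the certificate the kubelet serves on port 10250. Below is a minimal check with openssl, assuming the node IP 172.18.0.2 from the error above is reachable from where you run it (on Linux the kind Docker network usually is):

# Dump the kubelet's serving certificate and its SAN list.
# The self-signed certificate typically lists only the node hostname under
# "X509v3 Subject Alternative Name", with no IP entries - exactly what the
# x509 error above complains about.
openssl s_client -connect 172.18.0.2:10250 </dev/null 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -A1 'Subject Alternative Name'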
The metrics-server pod never becomes ready (here k is an alias for kubectl):
# k -n kube-system get po -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-6757d65f8-dk94t 0/1 Running 0 4m33s
# k top no
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
# k top po
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
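The aggregated metrics API reports the same unavailability. A quick way to see it is to check the APIService object that the upstream components.yaml registers (v1beta1.metrics.k8s.io):

# The metrics APIService stays unavailable while the metrics-server pod is unready;
# AVAILABLE typically shows False (MissingEndpoints) in this state.
kubectl get apiservice v1beta1.metrics.k8s.io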
Temporary Solution
It can be worked around by adding the --kubelet-insecure-tls argument, which tells metrics-server to skip verification of the kubelet serving certificates. That gets metrics flowing, but it is not an ideal solution:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
# ......
spec:
  template:
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # append this arg
        image: zengxu/metrics-server:v0.6.3
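Instead of editing the manifest by hand, the same workaround can be applied with a JSON patch. This is a sketch that assumes the container layout of the upstream Deployment (a single metrics-server container at index 0):

# Append --kubelet-insecure-tls to the args of the metrics-server container.
kubectl -n kube-system patch deployment metrics-server --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'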
Either way, the pod becomes Ready:
# k -n kube-system get po -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-7499c765d9-bw8rv 1/1 Running 0 103s
# k top no
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
kind-control-plane 191m 1% 582Mi 0%
kind-worker 43m 0% 145Mi 0%
kind-worker2 30m 0% 126Mi 0%
Ideal Solution
The proper fix is to let each kubelet request a serving certificate signed by the cluster CA, which is enabled by setting serverTLSBootstrap: true in the KubeletConfiguration. In KinD this can be passed through kubeadmConfigPatches:
cat << EOF | kind create cluster --config -
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.26.2
  kubeadmConfigPatches:           # -----+
  - |                             # | (set up the cluster
    kind: KubeletConfiguration    # |  with this patch)
    serverTLSBootstrap: true      # -----+
- role: worker
  image: kindest/node:v1.26.2
- role: worker
  image: kindest/node:v1.26.2
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
EOF
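The kind-specific part is only the delivery mechanism. On a cluster built directly with kubeadm, the same setting goes into the KubeletConfiguration passed to kubeadm init; here is a minimal sketch (kubeadm-config.yaml is just a placeholder file name):

# Sketch: enable serving certificate bootstrapping on a plain kubeadm cluster.
cat > kubeadm-config.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.26.2
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
EOF
kubeadm init --config kubeadm-config.yaml

For an existing kubeadm cluster, the official documentation describes adding serverTLSBootstrap: true to the kubelet-config ConfigMap and to each node's /var/lib/kubelet/config.yaml, then restarting the kubelet.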
Once the cluster is ready, apply the metrics-server manifests
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.3/components.yaml
This time an error is thrown whenever kubectl needs to talk to a kubelet, for example when fetching pod logs:
# k -n kube-system logs metrics-server-6757d65f8-tfwb5
Error from server: Get "https://172.18.0.3:10250/containerLogs/kube-system/metrics-server-6757d65f8-tfwb5/metrics-server": remote error: tls: internal error
That is because the kubelets' serving certificate requests (CSRs) have not been approved yet, so they still have no signed serving certificate to present:
# k -n kube-system get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-brvcz 85s kubernetes.io/kubelet-serving system:node:kind-control-plane <none> Pending
csr-c24zs 91s kubernetes.io/kubelet-serving system:node:kind-control-plane <none> Pending
csr-k4ggc 67s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:abcdef <none> Approved,Issued
csr-pbbxh 65s kubernetes.io/kubelet-serving system:node:kind-worker2 <none> Pending
csr-r8gk7 67s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:abcdef <none> Approved,Issued
csr-srh22 65s kubernetes.io/kubelet-serving system:node:kind-worker <none> Pending
Approve the kubelet serving certificate requests
for kubeletcsr in `kubectl -n kube-system get csr | grep kubernetes.io/kubelet-serving | awk '{ print $1 }'`; do kubectl certificate approve $kubeletcsr; done
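An alternative sketch that avoids parsing the human-readable table is to select the CSRs by signer name:

# Approve every CSR created for the kubelet-serving signer.
kubectl get csr --field-selector spec.signerName=kubernetes.io/kubelet-serving \
  -o jsonpath='{.items[*].metadata.name}' \
  | xargs kubectl certificate approve

Keep in mind that kubelets file new serving CSRs when their certificates rotate and when new nodes join, so those also need to be approved, either manually like this or by an approver controller running in the cluster.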
Once the CSRs are approved, everything works as expected
# k -n kube-system get po -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-6757d65f8-tfwb5 1/1 Running 0 3m42s
# k -n kube-system top po
NAME CPU(cores) MEMORY(bytes)
coredns-787d4945fb-2dwn9 4m 13Mi
coredns-787d4945fb-cl288 2m 12Mi
etcd-kind-control-plane 34m 30Mi
kindnet-hql7g 1m 8Mi
kindnet-jdxl6 1m 7Mi
kindnet-xrdkl 1m 8Mi
kube-apiserver-kind-control-plane 67m 263Mi
kube-controller-manager-kind-control-plane 26m 43Mi
kube-proxy-4fc8z 1m 11Mi
kube-proxy-hpckv 2m 11Mi
kube-proxy-t275x 2m 11Mi
kube-scheduler-kind-control-plane 6m 18Mi
metrics-server-6757d65f8-tfwb5 5m 15Mi
# k top no
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
kind-control-plane 164m 1% 577Mi 0%
kind-worker 33m 0% 147Mi 0%
kind-worker2 24m 0% 119Mi 0%
Author
Zeng Xu
LastMod
2023-04-23 17:53
License
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please credit the original article with a link when republishing.