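Kubernetes Advanced Scheduling
Node affinity scheduling
Types of node affinity
Node affinity comes in two forms: hard affinity, requiredDuringSchedulingIgnoredDuringExecution, and soft affinity, preferredDuringSchedulingIgnoredDuringExecution.
Hard affinity: the Pod is scheduled only onto a node that satisfies the specified conditions; if no node does, it is not scheduled.
Soft affinity: the scheduler prefers nodes that satisfy the conditions, but falls back to other nodes when none qualify.
In both cases IgnoredDuringExecution means the rule is checked only at scheduling time; a Pod that is already running stays put if the node's labels change later.
Hard node affinity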
nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          # Hard rule: only nodes whose hostname label is k8s-m2 qualify.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: "In"
                values:
                - "k8s-m2"
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
Create the Deployment
kubectl apply -f nginx-deploy.yaml
Observe where the Pods are scheduled
kubectl get pods -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-59c46cc77d-grwq5   1/1     Running   0          5s    10.244.1.84   k8s-m2   <none>           <none>
nginx-59c46cc77d-hvn6n   1/1     Running   0          5s    10.244.1.85   k8s-m2   <none>           <none>
nginx-59c46cc77d-z8rrn   1/1     Running   0          5s    10.244.1.83   k8s-m2   <none>           <none>
All three Pods were scheduled onto k8s-m2 and nowhere else.
Soft node affinity
nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          # Soft rule: k8s-m2 is preferred, but other nodes remain eligible.
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: kubernetes.io/hostname
                operator: "In"
                values:
                - "k8s-m2"
            weight: 10 # valid range: 1-100
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "500m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
Create the Deployment
kubectl apply -f nginx-deploy.yaml
Check where the Pods are scheduled
kubectl get po -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-544888dd59-b58ch   1/1     Running   0          5s    10.244.1.86   k8s-m2   <none>           <none>
nginx-544888dd59-m8t28   1/1     Running   0          5s    10.244.1.87   k8s-m2   <none>           <none>
nginx-544888dd59-x8pmf   1/1     Running   0          5s    10.244.2.69   k8s-m3   <none>           <none>
Depending on the configured weight and the actual state of the nodes, there is still some chance that a Pod is scheduled onto another node.
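Hard and soft node affinity can also be combined in one nodeAffinity stanza. The following is a minimal sketch rather than part of the walkthrough above: it assumes the nodes carry the standard kubernetes.io/os label, and the Pod name nginx-affinity-demo is made up for illustration. The hard rule narrows the candidates to Linux nodes, and the soft rule then favours k8s-m2 among them.

```yaml
# Sketch: hard and soft node affinity in a single Pod spec.
# The Pod name is hypothetical; kubernetes.io/os is a standard node label.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-affinity-demo
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only Linux nodes are candidates.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      # Soft rule: among the candidates, favour k8s-m2.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - k8s-m2
  containers:
  - name: nginx
    image: nginx:1.20.1-alpine
```

If k8s-m2 is full or unavailable, this Pod should still be placed on another Linux node, whereas a hard rule on the hostname would leave it Pending.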
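Pod affinity and anti-affinity scheduling
Types of Pod affinity
Pod affinity scheduling is divided into Pod affinity, podAffinity, and Pod anti-affinity, podAntiAffinity.
Each of the two in turn has a hard form, requiredDuringSchedulingIgnoredDuringExecution, and a soft form, preferredDuringSchedulingIgnoredDuringExecution.
Hard Pod affinity
Example: replicas of the same Pod must be scheduled onto the same node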
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAffinity:
          # Hard rule: run on a node that already hosts a Pod labelled app=nginx.
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app: nginx
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
Observe where the Pods are scheduled
kubectl get po -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-5dcdcdbd48-dhk4x   1/1     Running   0          6s    10.244.2.76   k8s-m3   <none>           <none>
nginx-5dcdcdbd48-jncps   1/1     Running   0          6s    10.244.2.78   k8s-m3   <none>           <none>
nginx-5dcdcdbd48-x2wlr   1/1     Running   0          6s    10.244.2.77   k8s-m3   <none>           <none>
Example: replicas of the same Pod must not be scheduled onto the same node
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never share a node with another Pod labelled app=nginx.
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app: nginx
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
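Observe
kubectl get po -l app=nginx -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-d859b58db-2mxq7   1/1     Running   0          4s    10.244.1.89   k8s-m2   <none>           <none>
nginx-d859b58db-t87qg   1/1     Running   0          4s    10.244.2.79   k8s-m3   <none>           <none>
nginx-d859b58db-tqsdb   0/1     Pending   0          4s    <none>        <none>   <none>           <none>
Two Pods were scheduled onto different nodes. The third stays Pending because the remaining node, k8s-m1, carries a NoSchedule taint by default, so nothing can be scheduled there.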
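The topologyKey field decides what counts as "the same place". With kubernetes.io/hostname every node is its own domain, which is why the third replica above had nowhere to go. As a rough sketch of a coarser domain, the manifest below spreads replicas across zones instead of nodes. It assumes the nodes are labelled with the well-known topology.kubernetes.io/zone label, which the cluster in this article may not have, and the name nginx-zone-spread is hypothetical.

```yaml
# Sketch: hard anti-affinity per zone instead of per node.
# Assumes nodes are labelled with topology.kubernetes.io/zone;
# the Deployment name and its app label are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-zone-spread
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-zone-spread
  template:
    metadata:
      labels:
        app: nginx-zone-spread
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          # At most one matching Pod per distinct zone label value.
          - topologyKey: topology.kubernetes.io/zone
            labelSelector:
              matchLabels:
                app: nginx-zone-spread
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
```

With a hard rule like this, the number of replicas that can run is capped by the number of distinct zones, just as it was capped by the number of schedulable nodes in the example above.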
Soft Pod affinity
Example: replicas of the same Pod are scheduled onto the same node whenever possible
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAffinity:
          # Soft rule: prefer a node that already hosts a Pod labelled app=nginx.
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchExpressions:
                - key: app
                  operator: "In"
                  values:
                  - "nginx"
            weight: 50
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
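kubectl get po -l app=nginx -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-f744fbb8f-6lx2c   1/1     Running   0          16s   10.244.1.93   k8s-m2   <none>           <none>
nginx-f744fbb8f-cwhn9   1/1     Running   0          20s   10.244.1.91   k8s-m2   <none>           <none>
nginx-f744fbb8f-d9n6w   1/1     Running   0          18s   10.244.1.92   k8s-m2   <none>           <none>
When the node has enough resources, all replicas land on the same node. If you raise the CPU or memory settings so that a single node can no longer fit them all, some replicas are scheduled onto other nodes.
Example: replicas of the same Pod avoid sharing a node whenever possible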
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: prefer a node that hosts no other Pod labelled app=nginx.
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchExpressions:
                - key: app
                  operator: "In"
                  values:
                  - "nginx"
            weight: 50
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
kubectl get po -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-59bb4cf66c-2546f   1/1     Running   0          4s    10.244.2.83   k8s-m3   <none>           <none>
nginx-59bb4cf66c-pdv2c   1/1     Running   0          4s    10.244.1.94   k8s-m2   <none>           <none>
nginx-59bb4cf66c-w79xj   1/1     Running   0          4s    10.244.2.82   k8s-m3   <none>           <none>
Here k8s-m1 carries a NoSchedule taint, so the replica that cannot get a node of its own is scheduled onto a node that already runs another replica.
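Taints and tolerations
NoSchedule: Pods that do not tolerate the taint are not scheduled onto nodes that carry it.
NoExecute: Pods that do not tolerate the taint are evicted from the node.
PreferNoSchedule: the scheduler tries to avoid placing Pods that do not tolerate the taint on the node, but may still do so.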
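To make the three effects concrete, here is a hedged sketch of a Pod that tolerates one taint of each kind. The taint keys dedicated, maintenance and spot, the value batch, and the Pod name are all invented for this illustration; the comments show the kubectl taint command that would create the matching taint on a node.

```yaml
# Sketch: one toleration per taint effect.
# All keys, values and the Pod name are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo
spec:
  tolerations:
  # Matches: kubectl taint nodes <node> dedicated=batch:NoSchedule
  - key: dedicated
    operator: Equal
    value: batch
    effect: NoSchedule
  # Matches: kubectl taint nodes <node> maintenance:NoExecute
  - key: maintenance
    operator: Exists
    effect: NoExecute
  # Matches: kubectl taint nodes <node> spot:PreferNoSchedule
  - key: spot
    operator: Exists
    effect: PreferNoSchedule
  containers:
  - name: nginx
    image: nginx:1.20.1-alpine
```

A toleration only permits placement on a tainted node; it does not attract the Pod to it. Steering a Pod toward particular nodes still takes a nodeSelector or an affinity rule.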
NoSchedule example: tolerating the NoSchedule taint
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: "kubernetes.io/hostname"
            labelSelector:
              matchLabels:
                app: nginx
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
      tolerations:
      # The master taint has an empty value, so Equal with no value matches it;
      # operator: Exists would match as well.
      - key: node-role.kubernetes.io/master
        operator: Equal
        effect: NoSchedule
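Observe
kubectl get po -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-6587d9989c-9wjqm   1/1     Running   0          7s    10.244.1.95   k8s-m2   <none>           <none>
nginx-6587d9989c-gmrt2   1/1     Running   0          7s    10.244.0.20   k8s-m1   <none>           <none>
nginx-6587d9989c-tdzvh   1/1     Running   0          7s    10.244.2.84   k8s-m3   <none>           <none>
The earlier hard anti-affinity example could place Pods on only two nodes, because k8s-m1 has a NoSchedule taint and could not accept any. This Deployment tolerates that taint, so the third Pod is now scheduled onto k8s-m1. Many Pods in kube-system tolerate this taint as well, for example etcd, kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy, and the network plugin. Their descriptions show a toleration such as:
Tolerations: :NoSchedule op=Exists
Interested readers can explore these on their own.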
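For readers who want to decode that Tolerations line: it corresponds to a toleration with no key, the Exists operator, and the NoSchedule effect, which tolerates every NoSchedule taint regardless of its key. A minimal sketch of how that looks in a manifest follows; the Pod name is hypothetical, and this is an illustration rather than a copy of any kube-system manifest.

```yaml
# Sketch: YAML equivalent of "Tolerations: :NoSchedule op=Exists".
# The Pod name is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: tolerate-all-noschedule
spec:
  tolerations:
  # No key plus operator Exists matches any taint key;
  # the effect field narrows the match to NoSchedule taints.
  - operator: Exists
    effect: NoSchedule
  containers:
  - name: nginx
    image: nginx:1.20.1-alpine
```

Such a blanket toleration suits system components that must run on every node; for ordinary workloads, naming the specific taint key is usually the safer choice.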
NoExecute example
When a taint with the NoExecute effect is added to a node, Pods that do not tolerate it are evicted from that node.
kubectl taint nodes k8s-m1 master:NoExecute
Observe the effect
kubectl get po -l app=nginx -o wide
NAME                     READY   STATUS        RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
nginx-6587d9989c-9wjqm   1/1     Running       0          8m34s   10.244.1.95   k8s-m2   <none>           <none>
nginx-6587d9989c-cv25q   0/1     Pending       0          3s      <none>        <none>   <none>           <none>
nginx-6587d9989c-gmrt2   0/1     Terminating   0          8m34s   10.244.0.20   k8s-m1   <none>           <none>
nginx-6587d9989c-tdzvh   1/1     Running       0          8m34s   10.244.2.84   k8s-m3   <none>           <none>
The nginx Pod on k8s-m1 is evicted immediately.
Remove the taint (note the trailing minus sign)
kubectl taint node k8s-m1 master:NoExecute-
Observe the Pods again: they can be scheduled normally once more.
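A Pod can also tolerate a NoExecute taint, optionally for a limited time only, through the tolerationSeconds field. The sketch below assumes the same master:NoExecute taint used in this example; the Pod name and the 60-second window are illustrative choices, not values from the walkthrough.

```yaml
# Sketch: tolerate the master:NoExecute taint for a bounded time.
# The Pod name and the 60-second window are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-noexecute-demo
spec:
  tolerations:
  - key: master
    operator: Exists
    effect: NoExecute
    # Stay on the node for up to 60 seconds after the taint appears,
    # then be evicted. Omit this field to tolerate the taint indefinitely.
    tolerationSeconds: 60
  containers:
  - name: nginx
    image: nginx:1.20.1-alpine
```

This toleration covers only the NoExecute taint. To run on k8s-m1 in the first place, a Pod would also need to tolerate that node's NoSchedule taint, as the Deployment in the NoSchedule example does.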
Pinning Pods to specific nodes (nodeSelector)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # Only nodes labelled diskType=ssd are eligible.
      nodeSelector:
        diskType: ssd
      containers:
      - name: nginx
        image: nginx:1.20.1-alpine
        resources:
          limits:
            memory: "256Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
Create the Deployment
kubectl apply -f nginx-deploy.yaml
No node currently has the label diskType=ssd, so there is no node the Pod can be scheduled onto.
kubectl get po -l app=nginx -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
nginx-554c6797c-z6plx   0/1     Pending   0          11s   <none>   <none>   <none>           <none>
Label a node
kubectl label node k8s-m2 diskType=ssd
Check the Pod again: it has now been scheduled onto k8s-m2.
kubectl get po -l app=nginx -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
nginx-554c6797c-z6plx   1/1     Running   0          34s   10.244.1.96   k8s-m2   <none>           <none>
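The same constraint can be written as hard node affinity, which is handy once a plain equality match is no longer enough (for example, when several label values should be accepted). The following sketch should behave like the nodeSelector above for the diskType=ssd label; the Pod name is hypothetical.

```yaml
# Sketch: nodeSelector "diskType: ssd" expressed as hard node affinity.
# The Pod name is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: diskType
            operator: In
            values:
            - ssd   # further accepted values could be listed here
  containers:
  - name: nginx
    image: nginx:1.20.1-alpine
```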
Common use cases
- GPU scheduling: running Pods that need a GPU.
- SSD disks: I/O-intensive workloads such as databases and caches can be scheduled onto nodes with SSD disks.
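As a rough illustration of the GPU case, the sketch below pins a Pod to GPU nodes with a nodeSelector and requests one GPU. Several things here are assumptions that this article does not cover: the node label accelerator=nvidia-gpu and the image name are made up, and the nvidia.com/gpu resource is only available when the NVIDIA device plugin is installed in the cluster.

```yaml
# Sketch: scheduling a GPU workload onto labelled GPU nodes.
# Assumptions: nodes are labelled accelerator=nvidia-gpu (hypothetical label),
# the NVIDIA device plugin exposes the nvidia.com/gpu resource,
# and the image name is a placeholder.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload-demo
spec:
  nodeSelector:
    accelerator: nvidia-gpu
  containers:
  - name: worker
    image: example.com/gpu-workload:latest
    resources:
      limits:
        nvidia.com/gpu: 1   # request a single GPU
```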
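- Original author: 黄忠德
- Original link: https://huangzhongde.cn/post/Kubernetes/Kubernetes%E9%AB%98%E7%BA%A7%E8%B0%83%E5%BA%A6%E7%AD%96%E7%95%A5/
- Copyright notice: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. For non-commercial reposts, credit the source (author and original link); for commercial reposts, contact the author for permission.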