Introduction and Application of Kubernetes

Cloud-based architecture / container orchestration tool

Posted by Mr. Liu on 2022-05-23
Estimated Reading Time 48 Minutes
Words 9.2k In Total

Introduction and Application of Kubernetes

What is Kubernetes?

  • Open source container orchestration tool

  • Developed by Google

  • Helps you manage containerized applications in different deployment environments (virtual / physical)

Problems Solved?

  • Trend from monolith to microservices

Emerging from the agile practitioner communities, the microservice-oriented architecture emphasizes implementing and employing multiple small-scale and independently deployable microservices, rather than encapsulating all function capabilities into one monolithic application.

  • Increased usage of containers

Containers are increasingly used in place of virtual machines.

Containers virtualize at the operating-system level, whereas virtual machines virtualize the hardware.

Container runtimes include Docker, containerd, Podman, etc.

  • Demand for a proper way of managing these hundreds of containers

Tasks of an orchestration tool?

  • High Availability or no downtime

  • Scalability or high performance

  • Disaster recovery - backup and restore

Kubernetes Architecture

Worker processes

Worker machine in K8s cluster

  • Each Node has multiple Pods on it

  • 3 processes must be installed on every Node

  • Worker nodes do the actual work

3 processes

  • Container runtime (e.g. Docker)

  • Kubelet (interacts with both the container and the node)

The kubelet is the primary agent on a worker node. It periodically receives new or modified Pod specifications from kube-apiserver (a master process) and makes sure the Pods and their containers are running in the desired state. It also monitors the worker node and reports the host's health back to kube-apiserver. (Reference: 一文看懂 Kubelet)

  • Kube proxy (forwards requests from Services to Pods)

kube-proxy manages the access entry points of Services, including Pod-to-Service access inside the cluster and access to Services from outside the cluster. (Reference: kubernetes核心组件kube-proxy)

Master Processes

4 processes


  • API Server

The cluster gateway: clients such as the dashboard or the command line talk to the cluster through it

Acts as a gatekeeper for authentication

The K8s API Server exposes HTTP REST interfaces for creating, deleting, updating, querying, and watching all kinds of Kubernetes resource objects (Pods, Services, etc.). It is the data bus and data hub of the whole system.

  • Scheduler

Based on the Pod's resource requirements and the resources available on each node, the Scheduler decides which node the Pod should be placed on.
More details can be found in a later section.

  • Controller manager

Detects cluster state changes (e.g. Pods crashing) and tries to restore the desired state

  • etcd

    • A key-value store of cluster state
    • Cluster brain
    • Cluster changes get stored in the key value store.
    • All other master processes are based on etcd.
      • Is the cluster healthy?
      • What resources are available?
      • Did the cluster state change?

Application data is not stored in etcd; it lives in the application's own database.
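On a kubeadm- or minikube-based cluster you can actually see these master processes running as Pods in the kube-system namespace. A quick sketch (pod names, suffixes, and ages are illustrative):

```bash
kubectl get pods -n kube-system
#NAME                               READY   STATUS    RESTARTS   AGE
#etcd-minikube                      1/1     Running   0          5m
#kube-apiserver-minikube            1/1     Running   0          5m
#kube-controller-manager-minikube   1/1     Running   0          5m
#kube-scheduler-minikube            1/1     Running   0          5m
#kube-proxy-xxxxx                   1/1     Running   0          5m
```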

Load Balancing

In production there are multiple master nodes, with a load balancer in front of the API Server.

Comparison

Worker nodes generally need more resources than master nodes, because they run the actual application workloads, while the master processes are less resource-intensive.

Brief Introduction

Kubernetes features

  • Service discovery and load balancing

Kubernetes can expose a container using a DNS name or its own IP address. If traffic to a container is high, Kubernetes can load balance and distribute the network traffic so that the deployment stays stable.

  • Storage orchestration

Kubernetes allows you to automatically mount a storage system of your choice, such as local storage or a public cloud provider.

  • Automated rollouts and rollbacks

You can describe the desired state of your deployed containers, and Kubernetes changes the actual state to the desired state at a controlled rate. For example, you can automate Kubernetes to create new containers for your deployment, remove existing containers, and adopt all of their resources into the new containers.

  • Automatic bin packing

Kubernetes allows you to specify how much CPU and memory (RAM) each container needs. When containers have resource requests specified, Kubernetes can make better decisions about how to manage container resources.

  • Self-healing

Kubernetes restarts containers that fail, replaces containers, kills containers that do not respond to user-defined health checks, and does not advertise them to clients until they are ready to serve.

  • Secret and configuration management

Kubernetes lets you store and manage sensitive information such as passwords, OAuth tokens, and SSH keys. You can deploy and update secrets and application configuration without rebuilding container images and without exposing secrets in your stack configuration.

Kubernetes Components

Node

A simple server, physical or virtual machine

Pod

  • Smallest unit of K8s

  • Abstraction over container

  • Usually 1 app container runs inside of it (but it can have more containers)

  • Each Pod gets its own IP address

  • New IP address on re-creation (when a Pod dies)

What does a Pod do? It creates a running environment, a layer on top of the container.

Why use a Pod? Because K8s wants to abstract away the container runtime and container technologies (so that you do not operate Docker directly).

Service

  • Permanent IP address

  • Lifecycle of Pod and Service are not connected (when a Pod dies, the Service and its address stay)

  • External service vs Internal Service

Deployment

Replicate everything in order to avoid downtime. Does each replica need its own IP? No.

For this we rely on a Service, because a Service has a permanent IP address and acts as a load balancer.

So what is Deployment?

Blueprint for pods. So you can scale up or scale down the number of replicas. Deployment is an abstraction on top of pods. Mostly, you work with deployment and not with pods.
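For example, scaling the number of replicas is a one-line operation on the Deployment (a quick sketch; the deployment name my-app is illustrative):

```bash
kubectl scale deployment my-app --replicas=3   # run 3 replicas of the my-app Pods
```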

However, a database cannot be replicated via a Deployment, because a database has state, namely its data: if we had clones or replicas, they would all access the same shared data storage, and uncoordinated reads and writes from the Pods would leave the database inconsistent. StatefulSet solves this.

Ingress


When we have many services, it is impractical to expose a NodePort for each one, so we use Ingress to map requests to these services.


ConfigMap


A ConfigMap is an API object used to store non-confidential data in key-value pairs. The data can be consumed as environment variables, command-line arguments, or configuration files in a volume.

A ConfigMap decouples environment-specific configuration from container images, which makes it easy to change application configuration. If you need to store sensitive information, use a Secret object instead.

  • Set ConfigMap data as container environment variables (cannot be changed after the container starts)
  • Set ConfigMap data as command-line arguments
  • Mount the ConfigMap as a file or directory with a Volume (the configuration seen by the Pod can change when the file changes)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb-express
spec:
  template:
    metadata:
      labels:
        app: mongodb-express
    spec:
      containers:
      - name: mongodb-express
        image:
        ports:
        - containerPort: 8081 # different containers expose different ports
        env:
        - name: ME_CONFIG_MONGODB_SERVER
          valueFrom:
            configMapKeyRef:
              name: mongodb-configmap
              key: database_url

___
apiVersion: v1
kind: Service
metadata:
  name: mongo-express-service
spec:
  selector:
    app: mongo-express
  ports:
  - protocol: TCP
    port: 8081
    targetPort: 8081
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-configmap
data:
  database_url: mongodb-service

Secrets

Has a similar function to ConfigMap, but is used to store sensitive data (base64-encoded).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
spec:
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image:
        ports:
        - containerPort: 27017 # different containers expose different ports
        env:
        - name: MONGO_INITDB_ROOT_USERNAME
          valueFrom:
            secretKeyRef:
              name: mongodb-secret
              key: mongo-root-username

apiVersion: v1
kind: Secret
metadata:
  name: mongodb-secret
type: Opaque
data:
  mongo-root-username: dXNlcm5hbWU= # base64 encoding of 'username'
  mongo-root-password: cGFzc3dvcmQ= # base64 encoding of 'password'
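The base64-encoded values for the Secret can be produced on the command line, for example:

```bash
echo -n 'username' | base64   # dXNlcm5hbWU=
echo -n 'password' | base64   # cGFzc3dvcmQ=
```

Note that base64 is only an encoding, not encryption.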

Volumes

Data storage

Storage can be on a local machine or remote, outside of the K8s cluster.
K8s does not manage data persistence; that is your responsibility.

StatefulSet

For stateful apps, such as mongo DB, MySQL

It takes care of replicating the Pods and scaling them up or down, while making sure database reads and writes stay synchronized.


How to use it on a personal device?

The answer is minikube!

What is minikube?

  • Creates a virtual box (VM) on your laptop.
  • The Node runs in that virtual box
  • 1 node K8s cluster

The ways to communicate with Api server

  • UI
  • API
  • CLI (kubectl)

Basic instructions

Pre-install minikube (to deploy K8s in a local environment), then run minikube start.

minikube turns the local machine into a single-node cluster (Docker must be pre-installed on the machine).

minikube version
minikube start

kubectl version
kubectl cluster-info
kubectl get nodes # list nodes

Kubernetes Pod

Create Pod

Create a Pod from a configuration file

Use Docker commands to pull images from the remote repository to the local one.
(Reference: Linux(CentOS7)安装Docker,镜像拉取、使用及常用操作)

For example: Simple configuration file (hello-k8s-pod.yaml)

apiVersion: v1
kind: Pod
metadata:
  name: k8s-httpd
spec:
  containers:
  - image: httpd # the local image repository already has an httpd image
    imagePullPolicy: IfNotPresent
    name: httpd
    ports:
    - containerPort: 8080
      protocol: TCP
kubectl create -f hello-k8s-pod.yaml
kubectl get pods

#NAME        READY   STATUS              RESTARTS   AGE
#k8s-httpd   0/1     ContainerCreating   0          16s
#after a few seconds or minutes, ContainerCreating becomes Running
#k8s-httpd   1/1     Running             0          5m21s

As experienced K8s developers point out, we usually do not work with Pods directly; we use Deployments.

kubectl apply -f [file name]
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.16
        ports:
        - containerPort: 80

Note that K8s will not recreate the same Deployment when you apply an identical Deployment configuration again.

**Create a Pod with commands**

```bash
kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.4
#deployment.apps/hello-minikube created
kubectl expose deployment hello-minikube --type=NodePort --port=8080
#service/hello-minikube exposed
kubectl get services hello-minikube
#NAME             TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
#hello-minikube   NodePort   10.101.121.221   <none>        8080:30155/TCP   116s
```

  • ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
  • NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service from outside the cluster by requesting <NodeIP>:<NodePort>. The NodePort is the externally exposed port.
kubectl get deployments # view deployment info

echo -e "\n\n\n\e[92mStarting Proxy. After starting it will not output a response. Please click the first Terminal Tab\n"

kubectl proxy # start a proxy so that the host can talk to the K8s API directly

curl http://localhost:8001/version # query the version

export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}')

echo Name of the Pod: $POD_NAME

curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME/ # query the Pod through the API by its name

Inspect Pods and worker nodes

What is a pod?

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.

A Pod (as in a pod of whales or pea pod) is a group of one or more containers, with shared storage and network resources, and a specification for how to run the containers.

Note:

Restarting a container in a Pod should not be confused with restarting a Pod. A Pod is not a process, but an environment for running container(s). A Pod persists until it is deleted.


  • kubectl get pods - list resources

  • kubectl describe pods - show detailed information about a resource

    Mainly describes the Pod's containers, including IP, image, ports, and events that happened on the Pod

  • kubectl logs - print the logs of a Pod and its containers

  • kubectl exec - execute a command in a container of the Pod

Check the logs of a specified Pod

kubectl logs $POD_NAME

View the environment variables of a specified Pod

kubectl exec $POD_NAME -- env

Open a bash shell in a Pod

kubectl exec -ti $POD_NAME -- bash
exit # leave the container

Step into a pod

kubectl exec k8s-httpd -- ls   
#bin
#build
#cgi-bin
#conf
#error
#htdocs
#icons
#include
#logs
#modules

Let us review how to step into a container!

kubectl exec -it my-pod --container main-app -- /bin/bash # when a pod (my-pod) has more than one container, we can step into a specific container (main-app)
docker exec -ti minikube ls # step into the minikube container and execute the command ls

docker exec -ti minikube ps -ef | grep kube # list the kube-related processes inside the minikube container


Let us find the httpd container (which we created earlier).

docker exec -ti minikube docker exec -ti k8s-httpd ls


docker exec -ti minikube docker exec -ti k8s_k8s-httpd_k8s-httpd_default_418d0482-e835-4e81-823e-efbe0d58ddf5_0 ps -ef | grep httpd


Pod update and replacement

How to edit a deployment?

kubectl edit deployment nginx-depl
# it will open a file and you can edit it to update deployment.
# then the old pod will be Terminating and new pod will be running.

When the Pod template for a workload resource is changed, the controller creates new Pods based on the updated template instead of updating or patching the existing Pods.

Pod updates may not change fields other than spec.containers[].image, spec.initContainers[].image, spec.activeDeadlineSeconds or spec.tolerations. For spec.tolerations, you can only add new entries.

Let us have a look at initContainers! (Only after the initContainers finish successfully will the regular containers start.)

myapp.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
  - name: init-mydb
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]

This example defines a simple Pod that has two init containers. The first waits for myservice, and the second waits for mydb. Once both init containers complete, the Pod runs the app container from its spec section.

We can see that this Pod will not run its app container until both init containers complete.

#NAME        READY   STATUS     RESTARTS   AGE
#myapp-pod   0/1     Init:0/2   0          6m

Delete a Deployment

kubectl delete deployment mongo-db
#then the Pods managed by this Deployment will start terminating

Summary

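As a rough recap of the commands used in this chapter (a minimal cheat sheet; resource names are illustrative):

```bash
kubectl create -f pod.yaml            # create a resource from a file
kubectl apply -f deployment.yaml      # create or update a resource from a file
kubectl edit deployment nginx-depl    # edit a live deployment
kubectl get pods                      # list pods (also: deployments, services, nodes)
kubectl describe pod <pod-name>       # detailed info and events
kubectl logs <pod-name>               # container logs
kubectl exec -it <pod-name> -- bash   # open a shell in the pod
kubectl delete deployment <name>      # delete by name
kubectl delete -f deployment.yaml     # delete by file
```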

About the delete operation, a bit more can be said.

We can combine the following two commands to delete objects via their configuration files.

kubectl get deployment nginx-depl -o yaml > nginx-dep-result.yaml
kubectl delete -f nginx-dep-result.yaml
# If we want to delete service, we just need to output service yaml file.

Kubernetes Configuration File

3 parts

Metadata

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
spec:
  replicas:
  selector:
  template:

Service

apiVersion: v1
kind: Service
metadata:
  name: nginx-service # specify the name
spec:
  selector:
    app: nginx
  ports:

Specification

It is specific to the kind!

Status

The status is maintained by K8s itself; we do not edit it. K8s continuously compares the actual state with the desired state to decide whether anything needs to change. This status data is stored in etcd.

And how do we check it?

kubectl get deployment nginx-depl -o yaml > nginx-dep-result.yaml
#The status will be written into a yaml file and you can check it.
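The tail of the generated file contains the status section maintained by K8s. A sketch of what it might look like for a healthy single-replica Deployment (the values are illustrative):

```yaml
status:
  availableReplicas: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
  conditions:
  - type: Available
    status: "True"
```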

apiVersion

K8S的apiVersion该用哪个

  • Alpha:
    • The version names contain alpha (for example, v1alpha1).
    • The software may contain bugs. Enabling a feature may expose bugs. A feature may be disabled by default.
    • The support for a feature may be dropped at any time without notice.
    • The API may change in incompatible ways in a later software release without notice.
    • The software is recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support.
  • Beta:
    • The version names contain beta (for example, v2beta3).
    • The software is well tested. Enabling a feature is considered safe. Features are enabled by default.
    • The support for a feature will not be dropped, though the details may change.
    • The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, migration instructions are provided. Schema changes may require deleting, editing, and re-creating API objects. The editing process may not be straightforward. The migration may require downtime for applications that rely on the feature.
    • The software is not recommended for production uses. Subsequent releases may introduce incompatible changes. If you have multiple clusters which can be upgraded independently, you may be able to relax this restriction.
  • Stable:
    • The version name is vX where X is an integer.
    • The stable versions of features appear in released software for many subsequent versions.

For example, suppose there are two API versions, v1 and v1beta1, for the same resource. If you originally created an object using the v1beta1 version of its API, you can later read, update, or delete that object using either the v1beta1 or the v1 API version.

There are several API groups in Kubernetes:

  • The core (also called legacy) group is found at REST path /api/v1. The core group is not specified as part of the apiVersion field, for example, apiVersion: v1.

  • The named groups are at REST path /apis/$GROUP_NAME/$VERSION and use apiVersion: $GROUP_NAME/$VERSION (for example, apiVersion: batch/v1). You can find the full list of supported API groups in Kubernetes API reference.

In the last link, we can find the different API versions and component kinds. And most we need is in the core group.

apiVersion: v1
kubectl api-versions
#admissionregistration.k8s.io/v1
#apiextensions.k8s.io/v1
#apiregistration.k8s.io/v1
#apps/v1
#authentication.k8s.io/v1
#authorization.k8s.io/v1
#autoscaling/v1
#autoscaling/v2beta1
#autoscaling/v2beta2
#batch/v1
#batch/v1beta1
#certificates.k8s.io/v1
#coordination.k8s.io/v1
#discovery.k8s.io/v1
#discovery.k8s.io/v1beta1
#events.k8s.io/v1
#events.k8s.io/v1beta1
#flowcontrol.apiserver.k8s.io/v1beta1
#networking.k8s.io/v1
#node.k8s.io/v1
#node.k8s.io/v1beta1
#policy/v1
#policy/v1beta1
#rbac.authorization.k8s.io/v1
#scheduling.k8s.io/v1
#storage.k8s.io/v1
#storage.k8s.io/v1beta1
#v1

Template

It has its own metadata and spec, because the template is applied to the Pod and the Pod has its own configuration.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
spec:
  replicas:
  selector:
  template:
    metadata:
    spec: # blueprint of the pod
      containers:
      - name:
        image:
        ports:
        - containerPort:

Connection

Labels & selectors

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx # the Service selector uses the same label, so the Service knows which Deployment belongs to it
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx # the Deployment knows which Pods belong to it
  template:
    metadata:
      labels:
        app: nginx # matched by both the Deployment selector and the Service selector
    spec:

apiVersion: v1
kind: Service
metadata:
  name: nginx-service # specify the name
spec:
  selector:
    app: nginx
  ports:

Port

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name:
        image:
        ports:
        - containerPort: 8080
apiVersion: v1
kind: Service
metadata:
  name: nginx-service # specify the name
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80 # other services reach this Service on port 80
    targetPort: 8080 # the Service forwards requests to port 8080 on the Pod

Kubernetes Deployment

Expose your application with a Service

  • A Service in Kubernetes is an abstraction that defines a logical set of Pods and a policy by which to access them.
  • A Service routes traffic across a set of Pods. Services are the abstraction that allows Pods to die and be replicated in Kubernetes without impacting the application.
kubectl get services # list services
kubectl expose deployment/kubernetes-bootcamp --type="NodePort" --port 8080 # expose the app's service on the given port
kubectl describe services/kubernetes-bootcamp # show the details of a service
kubectl get pods -l app=kubernetes-bootcamp # list pods with the given label
kubectl get services -l app=kubernetes-bootcamp # list services with the given label
kubectl label pods $POD_NAME version=v1 # add a label to a pod; query it with kubectl get pods -l version=v1
kubectl delete service -l app=kubernetes-bootcamp # delete the service; this only removes the external access point, the pod keeps running and is reachable inside the cluster; to delete the pod, delete the corresponding deployment
apiVersion: v1
kind: Service
metadata:
  name:
spec:
  selector:
    app:
  type: LoadBalancer # assigns the Service an external IP address so it accepts external requests
  ports:
  - protocol: TCP
    port: 8081
    targetPort: 8081
    nodePort: 30000 # port for the external IP address, the port you put into the browser (30000-32767)

Types of services

  • ClusterIP: only has a Cluster-IP
  • NodePort: has a Cluster-IP and an External-IP
  • LoadBalancer: has a Cluster-IP and an External-IP

Simple Demo

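A minimal demo you can run yourself on minikube (a sketch; the deployment name and image are illustrative):

```bash
kubectl create deployment nginx-demo --image=nginx
kubectl expose deployment nginx-demo --type=NodePort --port=80
minikube service nginx-demo --url     # prints a URL such as http://192.168.49.2:3xxxx
curl $(minikube service nginx-demo --url)
```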

Kubernetes Services

What

Each Pod has its own address

Pods are ephemeral - are destroyed frequently!

Service:

  • Stable IP address
  • Load balancing
  • Loose coupling
  • Within & outside cluster

Service types

ClusterIP

Default type

Example: a Pod with two containers in it - the deployed microservice app itself, plus a side-car container that collects the microservice's logs.

Port vs targetPort

ports:
- protocol: TCP
  port: 3200 # the Service port is arbitrary
  targetPort: 3000 # targetPort must match the port the container is listening on; the Service forwards requests to the targetPort of the container

Headless

Client wants to communicate with 1 specific Pod directly. (ClusterIP will randomly select a pod.)

Use case: stateful applications, like databases.

We want to communicate with the master Pod directly (because only the master may write in such a stateful setup).

apiVersion: v1
kind: Service
metadata:
  name: mongodb-service-headless
spec:
  clusterIP: None # this is what makes the Service headless
  selector:
    app: mongodb
  ports:
  - protocol: TCP
    port: 27017
    targetPort: 27017

NodePort

NodePort service is accessible on a static port on each worker node.

VS ClusterIP.

  • ClusterIP is only reachable inside the cluster, on the cluster-internal port.
  • NodePort is also reachable from outside, on a fixed port of each worker node.
apiVersion: v1
kind: Service
metadata:
  name: ms-service-nodeport
spec:
  type: NodePort
  selector:
    app: microservice-one
  ports:
  - protocol: TCP
    port: 3200 # cluster port
    targetPort: 3000
    nodePort: 30008 # node port

NodePort is not secure because it exposes the physical port of the node.

LoadBalancer

Kubernetes的三种外部访问方式

Kubernetes 私有集群 LoadBalancer 解决方案

Most cloud providers have their own loadBalancer.

Before we access the port of the node, we will first access the loadBalancer.

apiVersion: v1
kind: Service
metadata:
  name: ms-service-loadbalancer
spec:
  type: LoadBalancer
  selector:
    app: microservice-one
  ports:
  - protocol: TCP
    port: 3200 # cluster port
    targetPort: 3000
    nodePort: 30008 # node port

LoadBalancer Service is an extension of NodePort Service

NodePort Service is an extension of ClusterIP Service

Kubernetes Namespaces

What is a Namespace?

  • Organise resources in namespaces
  • Virtual cluster inside a cluster

4 namespaces per default


  • Dashboard (useless)
  • System
    • Do not create or modify anything in kube-system
    • System processes
    • Master and kubectl processes
  • Public
    • Publicly accessible data
    • A configmap, which contains cluster information
  • Node-lease
    • Heartbeats of nodes
    • Each node has an associated lease object in this namespace
    • Determines the availability of a node
  • Default
    • Resources you create are located here

Create a namespace with command line

kubectl create namespace my-namespace
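A namespace can also be defined declaratively in YAML and created with kubectl apply -f:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
```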

What are the use cases?

Structure

Too many pods, deployments, services, configmaps. So you want to group them.

Such as dividing them into database, monitoring, elastic stack, nginx-ingress.

Conflicts

Many teams, same application

Resource Sharing

Staging and Development

Blue/Green Deployment: two version

Access and Resource Limits on Namespaces

How Namespace work and how to use it?

Check whether a resource type lives in a namespace or not.

kubectl api-resources --namespaced=false # resources that are not in a namespace
kubectl api-resources --namespaced=true # resources that are in a namespace

Create components in your specified namespace

  • Command
kubectl apply -f mysql-configmap.yaml --namespace=my-namespace
  • Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-configmap
  namespace: my-namespace
data:
  db_url: mysql-service.database

Check the components in specified namespace

kubectl get configmap -n my-namespace

If you want to change the active namespace, you can use kubens (install kubens first).

kubens
#default
#kube-node-lease
#kube-public
#kube-system
#kubernetes-dashboard
#my-namespace

kubens my-namespace
#context "minikube" modified # cluster name
#Active namespace is "my-namespace".

Kubernetes Ingress

What is Ingress


For security and simplicity, we do not want users to access a Service directly by IP address and port, so we introduce Ingress.

Configuration

apiVersion: v1
kind: Service
metadata:
  name: myapp-internal-service
spec:
  selector:
    app:
  type: LoadBalancer # assigns the Service an external IP address so it accepts external requests
  ports:
  - protocol: TCP
    port: 8081
    targetPort: 8080
    nodePort: 35010 # port for the external IP address, the port you put into the browser (30000-32767); we can reach it by sending a request to http://<ip>:35010
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name:
spec:
  rules: # routing rules
  - host: myapp.com
    http:
      paths: # URL path
      - backend:
          serviceName: myapp-internal-service
          servicePort: 8080

Ingress Controller (pod)

  • Evaluate all the rules
  • Manage redirections
  • Entrypoint to cluster
  • Many third-party implementations

K8s Nginx Ingress Controller


Multiple paths for same host

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name:
spec:
  rules: # routing rules
  - host: myapp.com
    http:
      paths: # URL paths
      - path: /analytics # http://myapp.com/analytics
        backend:
          serviceName: analytics-service
          servicePort: 3000
      - path: /shopping # http://myapp.com/shopping
        backend:
          serviceName: myapp-internal-service
          servicePort: 8080

Multiple sub-domains or domains

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name:
spec:
  rules: # routing rules
  - host: analytics.myapp.com
    http:
      paths: # URL path
      - backend:
          serviceName: analytics-service
          servicePort: 3000
  - host: shopping.myapp.com
    http:
      paths:
      - backend:
          serviceName: myapp-internal-service
          servicePort: 8080

Set up HTTPS

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: tls-example-ingress
spec:
  tls:
  - hosts:
    - myapp.com
    secretName: myapp-secret-tls
  rules: # routing rules
  - host: myapp.com
    http:
      paths: # URL path
      - backend:
          serviceName: myapp-internal-service
          servicePort: 8080
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secret-tls
  namespace: default
data:
  tls.crt: base64 encoded cert
  tls.key: base64 encoded key
type: kubernetes.io/tls

Kubernetes Volumes

Storage Requirements

  • Storage that does not depend on the pod lifecycle
  • Storage must be available on all nodes
  • Storage needs to survive even if the cluster crashes

PV、PVC、StorageClass详解

3 concepts

  • PV

    An abstraction over the underlying shared networked storage that defines the shared storage as a "resource", just as a Node is a resource that container applications can consume. PVs are created and configured by administrators and are directly tied to the concrete implementation of the shared storage.

  • PVC

    A user's "request" for storage resources. Just as a Pod consumes Node resources, a PVC consumes PV resources. A PVC can request a specific amount of storage and specific access modes.

  • StorageClass

    Marks the characteristics and performance of storage resources. Administrators can define storage resources as classes, much like a profile describing a storage device. From the StorageClass description you can tell the characteristics of each kind of storage and request storage that matches your application's needs. Volumes can then be created on demand.

Persistent Volume (PV)

  • A cluster resource

    • Need actual physical storage, like local disk, NFS server, cloud storage
  • External plugin to your cluster

  • Created via yaml file

    • Kind: persistentVolume
    • Spec: e.g. How much storage?

As a storage resource, a PV mainly defines key information such as storage capacity, access modes, storage type, reclaim policy, and the backend storage type.

NFS server

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-name
spec:
  capacity: # capacity
    storage: 5Gi
  volumeMode: Filesystem # Filesystem or Block (block device); the default is Filesystem
  accessModes: # access modes
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle # reclaim policy
  storageClassName: slow
  mountOptions:
  - hard
  - nfsvers=4.0
  nfs:
    path: /dir/path/on/nfs/server
    server: nfs-server-ip-address

The following PV types currently support block devices:

AWSElasticBlockStore, AzureDisk, FC, GCEPersistentDisk, iSCSI, Local volume, RBD (Ceph Block Device), VsphereVolume (alpha)

  • A volume with volumeMode set to Filesystem is mounted into a directory in the Pod. If the volume is backed by a block device and the device is currently empty, Kubernetes creates a filesystem on the device before mounting it for the first time.
  • You can set volumeMode to Block to use the volume as a raw block device. Such a volume is handed to the Pod as a block device without any filesystem on it. This mode is useful when the Pod needs the fastest possible access to the volume, with no filesystem layer between the Pod and the volume; the application running in the Pod must know how to handle a raw block device. See the raw block volume support documentation for how to use a volumeMode: Block volume in a Pod.

Access modes describe the access rights that an application has to the storage resource. The access modes are:

  • ReadWriteOnce (RWO): read-write, and the volume can only be mounted by a single Node.

  • ReadOnlyMany (ROX): read-only, can be mounted by multiple Nodes.

  • ReadWriteMany (RWX): read-write, can be mounted by multiple Nodes.

Local storage

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example-node

The PV types supported by Kubernetes are:

  • AWSElasticBlockStore: ElasticBlockStore provided by the AWS public cloud.

  • AzureFile: File storage provided by the Azure public cloud.

  • AzureDisk: Disk storage provided by the Azure public cloud.

  • CephFS: an open-source shared storage system.

  • FC (Fibre Channel): Fibre Channel storage devices.

  • FlexVolume: a plugin-based storage mechanism.

  • Flocker: an open-source shared storage system.

  • GCEPersistentDisk: PersistentDisk provided by GCE.

  • Glusterfs: an open-source shared storage system.

  • HostPath: a directory on the host machine, only for single-node testing.

  • iSCSI: iSCSI storage devices.

  • Local: local storage devices. Local PVs can currently be provided through specified block devices, or managed with the community-developed sig-storage-local-static-provisioner plugin.

  • NFS: Network File System.

  • Portworx Volumes: storage service provided by Portworx.

  • Quobyte Volumes: storage service provided by Quobyte.

  • RBD (Ceph Block Device): Ceph block storage.

  • ScaleIO Volumes: DellEMC storage devices.

  • StorageOS: storage service provided by StorageOS.

  • VsphereVolume: storage system provided by VMware.

PVs are outside of namespaces

They are accessible to the whole cluster

Persistent Volume Claim (PVC)

The Pod accesses the PV through the PVC: storage requests from the Pod are forwarded via the PVC to the bound PV.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-name
spec:
  storageClassName: manual
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Use PVC in Pods configuration

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: myfrontend
    image: nginx
    volumeMounts:
    - mountPath: "/var/www/html"
      name: mypd
  volumes:
  - name: mypd
    persistentVolumeClaim:
      claimName: pvc-name

Claims must be in the same namespace as the Pod.

When the claim finds a matching persistent volume, the volume is mounted into the Pod.

Life Cycle of PV and PVC

Think of a PV as an available storage resource and a PVC as a request for that storage.

  • Provisioning

    K8s supports two provisioning modes: static and dynamic. The result of provisioning is a set of created PVs.

    Static mode: the cluster administrator manually creates a number of PVs, configuring the characteristics of the backend storage when defining them.

    Dynamic mode: the cluster administrator does not create PVs manually; instead, a StorageClass describes the backend storage and marks it as a certain type. The PVC then declares which class of storage it needs, and the system automatically creates a PV and binds it to the PVC. A PVC can declare the class as "" to forbid dynamic provisioning for itself.

  • Binding

    Once a PVC is defined, the system selects from the existing PVs one that satisfies the PVC's requirements (storage space and access modes). When one is found, the PV is bound to the PVC and the application can use it. If no such PV exists, the PVC stays in the Pending state until a matching PV appears. Once a PV is bound to a PVC, it is owned exclusively by that PVC and cannot be bound to another. If the PVC requests less space than the PV provides, the whole PV still becomes available to the PVC, which may waste resources. With dynamic provisioning, the system finds a suitable StorageClass, automatically creates a PV, and binds it to the PVC.

  • Using

    A Pod uses a volume definition to mount the PVC to a path inside the container. The volume type is persistentVolumeClaim. After a container has mounted a PVC, it can use it continuously and exclusively. Multiple Pods can mount the same PVC.

  • Releasing

    When the storage is no longer needed, the PVC can be deleted. The PV bound to it is then marked as "released", but it cannot be bound to another PVC immediately: data written through the previous PVC may still remain on the storage device, and the PV can only be used again after it has been cleaned up.

  • Reclaiming

    For a PV, the administrator can set a reclaim policy that determines how leftover data is handled after the bound PVC releases the resource. Only after the PV's storage has been reclaimed can it be bound and used by a new PVC.

The following two scenarios describe how PV, PVC, StorageClass and Pods work together under static and dynamic provisioning.

Static provisioning: a PV and a PVC are bound, and the PVC is then used by a Pod.

Dynamic provisioning: a StorageClass and a PVC complete the binding dynamically (the system automatically creates the PV), and the PVC is then used by a Pod.

Why so many abstractions?

  • Developers do not care about the concrete storage.
  • They do not want to set up the actual storage themselves.

Others

ConfigMap and Secret

  • Local volumes
  • Not created via PV and PVC
  • Managed by K8s
  1. Create the ConfigMap or Secret component
  2. Mount it into your pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: busybox-container
    image: busybox
    volumeMounts:
    - name: config-dir
      mountPath: /etc/config
  volumes:
  - name: config-dir
    configMap:
      name: bb-configmap

StorageClass

A StorageClass is an abstract definition of storage resources. It hides the details of the backend storage from the user's PVC: users need to pay less attention to storage details, and administrators no longer have to manage PVs by hand, because the system creates and binds PVs automatically, implementing dynamic provisioning.

Configuration

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: storage-class-name
provisioner: kubernetes.io/aws-ebs # where/how to create the PV
parameters: # requirements for the PV
  type: io1
  iopsPerGB: "10"
  fsType: ext4

Abstraction under PVC

Usage

Requested by PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: storage-class-name

Pipeline

  • a. Pod claims storage via PVC
  • b. PVC requests storage from SC
  • c. SC creates PV that meets the needs of the claim

Kubernetes StatefulSet

StatefulSet for stateful applications

Stateful applications

  • Examples of stateful applications: databases, applications that store data
  • Deployed using StatefulSet

Stateless applications

  • Do not keep record of state
  • Each request is completely new
  • Deployed using Deployment

StatefulSet and Deployment both manage pods based on container specification!

Deployment vs StatefulSet

  • Deployment
    • Pods are identical and interchangeable.
    • Created in random order with random hashes.
    • One Service load balances to any Pod.
    • Data is lost when all Pods die.
  • StatefulSet
    • Pods cannot be created/deleted at the same time (think about a DB).
    • Pods cannot be randomly addressed.
    • Replica Pods are not identical
      • Pod Identity
        • Sticky identity for each pod
        • Created from the same specification, but not interchangeable!
        • Persistent identifier across any re-scheduling (when a Pod dies, the new Pod inherits its identity)
    • Data can survive even when all Pods die

The Pods are divided into a master and workers; only the master may write the data, while the workers can only read it.

2 characteristics of StatefulSet

  • predictable pod name: mysql-0, mysql-1, …
  • fixed individual DNS name: mysql-0.svc2…
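A minimal StatefulSet sketch tying these pieces together (assuming a MySQL image; a real setup would also need credentials and volumeClaimTemplates). The headless Service gives each Pod its fixed DNS name, and the Pods are named mysql-0, mysql-1, and so on:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None        # headless Service for stable per-pod DNS names
  selector:
    app: mysql
  ports:
  - port: 3306
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql     # the headless Service above
  replicas: 2            # creates mysql-0 and mysql-1, in order
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
```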

Kubernetes Role

Kubernetes角色访问控制RBAC和权限规则

User

kubernetes集群权限之Cluster、 User和Context

Role&ClusterRole

A Role is a set of permissions; for example, a Role can include the permission to list Pods and the permission to list Deployments. A Role grants access to resources within a single namespace.

Create a Role via a YAML resource manifest

Role

Scoped to a single namespace

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

ClusterRole

Can grant access across all namespaces, as well as to cluster-level resources and non-resource endpoints

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-clusterrole
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

Related parameters

Configurable apiGroups values for Role and ClusterRole

"","apps", "autoscaling", "batch"

These values refer to different API groups; the empty string "" means the core API group.

Configurable resources values for Role and ClusterRole

"services", "endpoints", "pods","secrets","configmaps","crontabs","deployments","jobs","nodes","rolebindings","clusterroles","daemonsets","replicasets","statefulsets","horizontalpodautoscalers","replicationcontrollers","cronjobs"

Configurable verbs (rules) for Role and ClusterRole

"get", "list", "watch", "create", "update", "patch", "delete", "exec"

RoleBinding&ClusterRoleBinding

A role binding grants the permissions defined in a role to a user or a set of users; the bound user or group then has the permissions defined in the referenced Role or ClusterRole.

A role binding contains a set of subjects (users, groups, or service accounts) and a reference to the role being granted. Permissions within a namespace are granted with a RoleBinding object, while cluster-wide permissions are granted with a ClusterRoleBinding object.

Use a RoleBinding to grant a user permissions in the default namespace

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: read-pods
  namespace: default
subjects: # who is granted the role
- kind: User
  name: Caden
  apiGroup: rbac.authorization.k8s.io
roleRef: # which role
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
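To check what a binding actually allows, kubectl auth can-i is handy (a quick sketch based on the example above; it assumes a Role granting get/watch/list on pods is bound to the user Caden):

```bash
kubectl auth can-i list pods --namespace default --as Caden     # yes
kubectl auth can-i delete pods --namespace default --as Caden   # no, the role only grants get/watch/list
```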

Use a ClusterRoleBinding to grant a group permissions across the whole cluster

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: read-pods-global
subjects:
- kind: Group
  name: pods-reader
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pods-reader
  apiGroup: rbac.authorization.k8s.io

Use a RoleBinding to reference a ClusterRole, but only within a single namespace

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: read-pods
  namespace: development # only grants access to resources in the "development" namespace; without this field, Pod resources in the whole cluster would be accessible
subjects:
- kind: ServiceAccount
  name: reader-dev
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pod-reader
  namespace: kube-system

Instructions

kubectl get role -n kube-system # list all roles in kube-system


kubectl get role kube-proxy -n kube-system -o yaml # show the details of the kube-proxy role
#apiVersion: rbac.authorization.k8s.io/v1
#kind: Role
#metadata:
# creationTimestamp: "2021-11-02T08:04:06Z"
# name: kube-proxy
# namespace: kube-system
# resourceVersion: "281"
# uid: f400dd8b-c7a2-4e5b-8449-020a307d1472
#rules:
#- apiGroups:
# - ""
# resourceNames:
# - kube-proxy
# resources:
# - configmaps
# verbs:
# - get
kubectl get rolebinding -n kube-system # list the role bindings in kube-system


kubectl get clusterrole # list the cluster roles
kubectl get clusterrolebinding # list the cluster role bindings

Other functions

Run multiple instances of your application (load balancing)

ReplicaSet for balancing

Keeps the app available even when some Pods are killed

  • NAME lists the names of the Deployments in the cluster.
  • READY shows the ratio of CURRENT/DESIRED replicas
  • UP-TO-DATE displays the number of replicas that have been updated to achieve the desired state.
  • AVAILABLE displays how many replicas of the application are available to your users.
  • AGE displays the amount of time that the application has been running.
kubectl get rs #see the ReplicaSet created by the Deployment

Load balancing

kubectl scale deployments/kubernetes-bootcamp --replicas=4 # scale to 4 replicas
curl $(minikube ip):$NODE_PORT # run this several times and you will see that different pods respond

You do not need to manage the replicas by hand; you can simply set the replica count in the Deployment.

Zone to balance

Balance the number of pods in different zones.

Pod Topology Spread Constraints

```YAML
kind: Pod
apiVersion: v1
metadata:
  name: mypod
  labels:
    foo: bar
spec:
  topologySpreadConstraints:
  - maxSkew: 1 # the maximum difference in Pod count between topologies; cask principle: the shortest plank determines the capacity of the barrel
    topologyKey: zone # marks the zone (it can also be node, region, etc.)
    whenUnsatisfiable: DoNotSchedule # if the condition cannot be satisfied, the pod will not be scheduled
    labelSelector:
      matchLabels:
        foo: bar
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.1
```

![en](zonebalance.png)

So mypod can only be placed into zoneB.

We can also combine multiple topologySpreadConstraints as follows.

```YAML
kind: Pod
apiVersion: v1
metadata:
  name: mypod
  labels:
    foo: bar
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  - maxSkew: 1
    topologyKey: node
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.1
```

Some nodes are exposed to the outside world, and we do not want to deploy Pods on them.

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        foo: bar
  affinity: # do not schedule the pod to zoneC
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: NotIn
            values:
            - zoneC
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.1

The scheduling strategy above is Pod-based; next we introduce the cluster-based strategy.

It will involve the knowledge of scheduler in k8s.

kube-scheduler --config <filename>

A simple configuration file can look like this:

apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/srv/kubernetes/kube-scheduler/kubeconfig

We will discuss the scheduler in more detail later; in this part we continue with the balancing problem.

apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration

profiles:
- pluginConfig:
  - name: PodTopologySpread
    args:
      defaultConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
      defaultingType: List

Now we do not need to set the constraint in every Pod's configuration file.

Perform a rolling update

kubectl set image deployments/kubernetes-bootcamp kubernetes-bootcamp=jocatalin/kubernetes-bootcamp:v2 # update the old version to a new version

kubectl rollout undo deployments/kubernetes-bootcamp # roll back to the previously known (last used) version
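Two related commands are useful while rolling out (a quick sketch):

```bash
kubectl rollout status deployments/kubernetes-bootcamp    # watch the progress of the rollout
kubectl rollout history deployments/kubernetes-bootcamp   # list previous revisions
```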

Scheduling

Scheduling Policies


A scheduling Policy can be used to specify the predicates and priorities that the kube-scheduler runs to filter and score nodes, respectively.

kube-scheduler --policy-config-file <filename>
#or
kube-scheduler --policy-configmap <ConfigMap>

Predicates

Simply put: predicates check whether a Node satisfies a Pod's requirements.

The following predicates implement filtering:

  • PodFitsHostPorts: Checks if a Node has free ports (the network protocol kind) for the Pod ports the Pod is requesting.
  • PodFitsHost: Checks if a Pod specifies a specific Node by its hostname.
  • PodFitsResources: Checks if the Node has free resources (eg, CPU and Memory) to meet the requirement of the Pod.
  • MatchNodeSelector: Checks if a Pod’s Node Selector matches the Node’s label(s).
  • NoVolumeZoneConflict: Evaluate if the Volumes that a Pod requests are available on the Node, given the failure zone restrictions for that storage.
  • (Volumes: directories that store data and can be accessed by containers)
  • NoDiskConflict: Evaluates if a Pod can fit on a Node due to the volumes it requests, and those that are already mounted.
  • MaxCSIVolumeCount: Decides how many CSI volumes should be attached, and whether that's over a configured limit. (CSI: Container Storage Interface)
  • PodToleratesNodeTaints: checks if a Pod’s tolerations can tolerate the Node’s taints.
  • (Tolerations: how much a Pod tolerates, matched against the Node's taints)
  • (Taints: prevent Pods from being scheduled onto a node.)
  • CheckVolumeBinding: Evaluates if a Pod can fit due to the volumes it requests. This applies for both bound and unbound PVCs.

Priorities

Simply put: priorities rank the nodes according to their resources and other criteria.

The following priorities implement scoring:

  • SelectorSpreadPriority: Spreads Pods across hosts, considering Pods that belong to the same Service, StatefulSet or ReplicaSet.
  • InterPodAffinityPriority: Implements preferred inter-pod affinity and anti-affinity.
  • LeastRequestedPriority: Favors nodes with fewer requested resources. In other words, the more Pods that are placed on a Node, and the more resources those Pods use, the lower the ranking this policy will give.
  • MostRequestedPriority: Favors nodes with most requested resources. This policy will fit the scheduled Pods onto the smallest number of Nodes needed to run your overall set of workloads.
  • RequestedToCapacityRatioPriority: Creates a requestedToCapacity based ResourceAllocationPriority using default resource scoring function shape.
  • BalancedResourceAllocation: Favors nodes with balanced resource usage.
  • NodePreferAvoidPodsPriority: Prioritizes nodes according to the node annotation scheduler.alpha.kubernetes.io/preferAvoidPods. You can use this to hint that two different Pods shouldn’t run on the same Node.
  • NodeAffinityPriority: Prioritizes nodes according to node affinity scheduling preferences indicated in PreferredDuringSchedulingIgnoredDuringExecution. You can read more about this in Assigning Pods to Nodes.
  • TaintTolerationPriority: Prepares the priority list for all the nodes, based on the number of intolerable taints on the node. This policy adjusts a node’s rank taking that list into account.
  • ImageLocalityPriority: Favors nodes that already have the container images for that Pod cached locally.
  • ServiceSpreadingPriority: For a given Service, this policy aims to make sure that the Pods for the Service run on different nodes. It favours scheduling onto nodes that don’t have Pods for the service already assigned there. The overall outcome is that the Service becomes more resilient to a single Node failure.
  • EqualPriority: Gives an equal weight of one to all nodes.
  • EvenPodsSpreadPriority: Implements preferred pod topology spread constraints.

Scheduler Configuration

A scheduling Profile allows you to configure the different stages of scheduling in the kube-scheduler.

Each stage is exposed in an extension point.

Extension points

Scheduling happens in a series of stages that are exposed through the following extension points:

  1. queueSort: These plugins provide an ordering function that is used to sort pending Pods in the scheduling queue. Exactly one queue sort plugin may be enabled at a time.
  2. preFilter: These plugins are used to pre-process or check information about a Pod or the cluster before filtering. They can mark a pod as unschedulable.
  3. filter: These plugins are the equivalent of Predicates in a scheduling Policy and are used to filter out nodes that can not run the Pod. Filters are called in the configured order. A pod is marked as unschedulable if no nodes pass all the filters.
  4. postFilter: These plugins are called in their configured order when no feasible nodes were found for the pod. If any postFilter plugin marks the Pod schedulable, the remaining plugins are not called.
  5. preScore: This is an informational extension point that can be used for doing pre-scoring work.
  6. score: These plugins provide a score to each node that has passed the filtering phase. The scheduler will then select the node with the highest weighted scores sum.
  7. reserve: This is an informational extension point that notifies plugins when resources have been reserved for a given Pod. Plugins also implement an Unreserve call that gets called in the case of failure during or after Reserve.
  8. permit: These plugins can prevent or delay the binding of a Pod.
  9. preBind: These plugins perform any work required before a Pod is bound.
  10. bind: The plugins bind a Pod to a Node. bind plugins are called in order and once one has done the binding, the remaining plugins are skipped. At least one bind plugin is required.
  11. postBind: This is an informational extension point that is called after a Pod has been bound.
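A sketch of how a profile can enable or disable plugins at a given extension point (the plugin names are examples from the default plugin set; check your cluster version for the exact names):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: my-custom-scheduler
  plugins:
    score:
      disabled:
      - name: ImageLocality        # do not score nodes by image locality
      enabled:
      - name: NodeResourcesFit     # weight resource fit more heavily
        weight: 2
```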

Helm

What is Helm

Package manager for kubernetes

To package YAML files

When you have too many components, it is tedious to manage them and search for them.

Other people can then use the packaged YAML files from the repository.

Templating Engine

If you want to create many microservices that differ only slightly, you do not need to write a separate YAML file for each one.

You can:

  • define a common blueprint
  • Dynamic values are replaced by placeholders

Template YAML config

apiVersion: v1
kind: Pod
metadata:
  name: {{ .Values.name }}
spec:
  containers:
  - name: {{ .Values.container.name }}
    image: {{ .Values.container.image }}
    port: {{ .Values.container.port }}

values.yaml

name: my-app
container:
  name: my-app-container
  image: my-app-image
  port: 9001

How to use them

When you install a chart, the values from values.yaml are injected into the template files.

If you want to use your own values file to override some entries in values.yaml, you can run:

helm install --values=my-values.yaml <chartname>

You can also set some values by command line.

helm install --set version=2.0.0
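A typical end-to-end workflow might look like this (a sketch; the repository, chart, and value names are illustrative):

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami    # add a chart repository
helm search repo nginx                                      # find a chart
helm install my-release bitnami/nginx --values=my-values.yaml
helm list                                                   # list installed releases
helm upgrade my-release bitnami/nginx --set image.tag=1.21  # change a value in place
helm uninstall my-release
```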

If you like this blog or find it useful for you, you are welcome to comment on it. You are also welcome to share this blog, so that more people can participate in it. If the images used in the blog infringe your copyright, please contact the author to delete them. Thank you !