Background
Kubernetes installation notes: this installation uses kubeadm, the officially supported deployment tool.
Environment
- OS: Debian, kernel 6.1.66-1 (2023-12-09) x86_64 GNU/Linux
- Containerd: v1.6.27
- Kubernetes: 1.29
Detailed steps
Choosing a container runtime
Because Kubernetes v1.24 removed dockershim, the component that adapted Docker Engine, this installation uses Containerd as the container runtime.
Containerd installation notes: Installation
Kubernetes components
You need to install the following packages on every machine:
- kubeadm: the command to bootstrap the cluster.
- kubelet: runs on every node in the cluster and starts Pods and containers.
- kubectl: the command-line tool to talk to the cluster.
kubeadm does not install or manage kubelet or kubectl for you, so you need to make sure their versions match the version of the control plane installed with kubeadm. Otherwise there is a risk of version skew, which can lead to unexpected bugs and issues. That said, one minor version of skew between the kubelet and the control plane is supported, as long as the kubelet version never exceeds the API server version. For example, a 1.7.0 kubelet is fully compatible with a 1.8.0 API server, but not the other way around.
Installing and configuring prerequisites
cgroup drivers
On Linux, control groups (cgroups) are used to constrain the resources allocated to processes.
Both the kubelet and the underlying container runtime need to interface with cgroups to enforce resource management for Pods and containers and to set requests and limits for resources such as CPU and memory. To interface with cgroups, the kubelet and the container runtime each need a cgroup driver. The critical point is that the kubelet and the container runtime must use the same cgroup driver with the same configuration.
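If /etc/containerd/config.toml does not exist yet (or you want to start from a clean default), containerd can generate one before editing it. A minimal sketch, assuming containerd was installed as described in the notes linked above:

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml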
Change Containerd's cgroup driver to systemd
myserver@peag-k8s-master:~$ sudo nano /etc/containerd/config.toml
......
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
......
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
......
Set the SystemdCgroup parameter to true.
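After saving the change, restart containerd so that the new cgroup driver takes effect:

sudo systemctl restart containerd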
Install kubeadm, kubelet, and kubectl
Disable swap
About swap: by default the kubelet fails to start when swap memory is detected on a node. The kubelet has supported swap since v1.22, and since v1.28 swap is supported only with cgroup v2; the kubelet's NodeSwap feature gate is in Beta but disabled by default.
Check the swap status, command: free -h
myserver@peag-k8s-master:~$ sudo free -h
total used free shared buff/cache available
Mem: 3.8Gi 462Mi 149Mi 3.6Mi 3.5Gi 3.4Gi
Swap: 974Mi 780Ki 974Mi
If Swap shows 0B (that is, no swap is configured), swap is already disabled and this step can be skipped.
Find the swap partition, command: blkid
myserver@peag-k8s-master:~$ sudo blkid
/dev/vda5: UUID="7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d" TYPE="swap" PARTUUID="d56debd6-05"
/dev/vda1: UUID="b4f5b969-2634-4eb8-8fd0-2c28fbb2cac6" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="d56debd6-01"
Disable swap by editing /etc/fstab and commenting out the line for the swap partition, for example: # UUID=7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d none swap sw 0 0
myserver@peag-k8s-master:~$ sudo nano /etc/fstab
myserver@peag-k8s-master:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# systemd generates mount units based on this file, see systemd.mount(5).
# Please run 'systemctl daemon-reload' after making changes here.
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/vda1 during installation
UUID=b4f5b969-2634-4eb8-8fd0-2c28fbb2cac6 / ext4 errors=remount-ro 0 1
# swap was on /dev/vda5 during installation
# UUID=7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d none swap sw 0 0
/dev/sr0 /media/cdrom0 udf,iso9660 user,noauto 0 0
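Commenting out the swap entry in /etc/fstab only keeps swap disabled across reboots; to turn swap off immediately in the running session, swapoff can be used as well:

sudo swapoff -a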
Reboot the system for the change to take effect, command: reboot
myserver@peag-k8s-master:~$ sudo reboot
Broadcast message from root@peag-k8s-master on pts/1 (Fri 2024-01-19 11:45:29 HKT):
The system will reboot now!
After the reboot, verify the swap status again, command: free -h
myserver@peag-k8s-master:~$ sudo free -h
total used free shared buff/cache available
Mem: 3.8Gi 331Mi 3.5Gi 3.6Mi 148Mi 3.5Gi
Swap: 0B 0B 0B
Configure apt
Update the apt package index and install the packages needed to use the Kubernetes apt repository:
myserver@peag-k8s-master:~$ sudo apt update
myserver@peag-k8s-master:~$ sudo apt install -y apt-transport-https ca-certificates curl gpg
Download the public signing key for the Kubernetes package repositories. The same signing key is used for all repositories, so you can ignore the version in the URL:
myserver@peag-k8s-master:~$ curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
Add the Kubernetes apt repository. Note that this repository contains packages for Kubernetes 1.29 only.
myserver@peag-k8s-master:~$ echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Install
Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions:
myserver@peag-k8s-master:~$ sudo apt update
myserver@peag-k8s-master:~$ sudo apt install -y kubelet kubeadm kubectl
myserver@peag-k8s-master:~$ sudo apt-mark hold kubelet kubeadm kubectl
apt-mark hold is a command of the apt package-management tooling on Debian and derivative distributions (such as Ubuntu). It marks the specified packages as held, meaning they will not be updated during future automatic upgrades. This is useful for packages you do not want upgraded automatically, for example packages whose stability matters or that have been customized.
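To list the packages currently on hold, or to release the hold later before a planned upgrade:

apt-mark showhold
sudo apt-mark unhold kubelet kubeadm kubectl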
Check the installation result
myserver@peag-k8s-master:~$ sudo kubectl version
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?
myserver@peag-k8s-master:~$ sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.1", GitCommit:"bc401b91f2782410b3fb3f9acf43a995c4de90d2", GitTreeState:"clean", BuildDate:"2024-01-17T15:49:02Z", GoVersion:"go1.21.6", Compiler:"gc", Platform:"linux/amd64"}
myserver@peag-k8s-master:~$ sudo kubelet --version
Kubernetes v1.29.1
Set SELinux to permissive mode (effectively disabling it).
myserver@peag-k8s-master:~$ sudo setenforce 0
myserver@peag-k8s-master:~$ sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
Note: if you see the error sudo: setenforce: command not found, SELinux is not installed and this step can be skipped.
Setting SELinux to permissive mode by running setenforce 0 and the sed ... command effectively disables it. This is required to allow containers to access the host filesystem, which is needed, for example, by some container network plugins. You have to do this until SELinux support is improved in the kubelet.
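When SELinux is present, the current mode can be checked with getenforce (or sestatus):

getenforce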
Creating a cluster with kubeadm - Master node
Preparation
- Set ip_forward to 1
myserver@peag-k8s-master:~$ sudo sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
myserver@peag-k8s-master:~$ cat /proc/sys/net/ipv4/ip_forward
1
- Set bridge-nf-call-iptables to 1
myserver@peag-k8s-master:~$ sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-iptables = 1
If the command fails with sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory, bridged network filtering is not enabled for iptables. See: enabling bridged network filtering, and the sketch below.
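A common way to enable bridged traffic filtering and make both settings persist across reboots is to load the br_netfilter module and place the sysctls in /etc/sysctl.d, following the upstream kubeadm prerequisites (the file names below are arbitrary):

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system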
Start the initialization
Before initializing, confirm that the local environment can reach registry.k8s.io. If it cannot, configure a domestic mirror registry for acceleration.
Pull the required images from the mirror, command: sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
myserver@peag-k8s-master:~$ sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.11.1
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.10-0
Initialize the node, command: kubeadm init <args>
Run the initialization as the root user; otherwise you may get permission denied errors.
myserver@peag-k8s-master:~ $ kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.29.1
- --pod-network-cidr: specifies the CIDR for the Pod network; 10.244.0.0/16 matches Flannel's default network.
- --service-cidr=10.96.0.0/12: specifies the CIDR for the Service network. The default is 10.96.0.0/12.
- --image-repository=registry.aliyuncs.com/google_containers: specifies the image repository.
- --kubernetes-version=v1.29.1: specifies the Kubernetes version to deploy.
Output on successful initialization
......
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.122.105:6443 --token feecq5.fbbocvez6v854bj3 \
--discovery-token-ca-cert-hash sha256:c25c4c07624eb13db94236093f57d33dd8c6d591f1d61fcffcf83dba0df71378
Following the prompt, set up the kubeconfig
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Check the node status
myserver@peag-k8s-master:~$ kubectl get node -A
NAME STATUS ROLES AGE VERSION
peag-k8s-master NotReady control-plane 45m v1.29.1
The network add-on is not installed yet, so the node is in the NotReady state.
Install a network add-on
Following the hint printed after kubeadm init completes, pick a network add-on that fits your needs from the official list and install it. URL: https://kubernetes.io/docs/concepts/cluster-administration/addons/
Flannel is used as the network add-on here; one-line deployment command:
myserver@peag-k8s-master:~$ kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
You can also download the kube-flannel.yml manifest locally first and deploy it with kubectl apply -f kube-flannel.yml.
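For example, downloading the manifest from the same release URL as above:

curl -LO https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml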
myserver@peag-k8s-master:~$ kubectl apply -f kube-flannel.yml
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Check the deployment result
Command: kubectl get nodes,pod -A
myserver@peag-k8s-master:~$ kubectl get nodes,pod -A
NAME STATUS ROLES AGE VERSION
node/peag-k8s-master Ready control-plane 3h44m v1.29.1
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel pod/kube-flannel-ds-6nf87 1/1 Running 0 2m10s
kube-system pod/coredns-5f98f8d567-h5bd9 1/1 Running 0 3h44m
kube-system pod/coredns-5f98f8d567-v8nt4 1/1 Running 0 3h44m
kube-system pod/etcd-peag-k8s-master 1/1 Running 2 (3h7m ago) 3h44m
kube-system pod/kube-apiserver-peag-k8s-master 1/1 Running 2 (3h7m ago) 3h44m
kube-system pod/kube-controller-manager-peag-k8s-master 1/1 Running 2 (3h7m ago) 3h44m
kube-system pod/kube-proxy-9dtpn 1/1 Running 1 (3h7m ago) 3h44m
kube-system pod/kube-scheduler-peag-k8s-master 1/1 Running 2 (3h7m ago) 3h44m
Problems & solutions
Problem 1: ip_forward contents are not set to 1
[init] Using Kubernetes version: v1.29.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution: see the preparation section above:
- Set bridge-nf-call-iptables to 1
- Set ip_forward to 1
Problem 2: couldn't get current server API group list
myserver@peag-k8s-master:~$ sudo kubectl get pod -A
E0122 11:48:06.079169 1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.080385 1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.082184 1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.083661 1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.084749 1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
Cause: kubectl has no kubeconfig set up, so it falls back to localhost:8080. Configure it as shown in the output at the end of kubeadm init:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Problem 3: node is NotReady
myserver@peag-k8s-master:~$ kubectl get node -A
NAME STATUS ROLES AGE VERSION
peag-k8s-master NotReady control-plane 45m v1.29.1
myserver@peag-k8s-master:~$ sudo journalctl -f -u kubelet.service
Jan 22 11:58:51 peag-k8s-master kubelet[518]: E0122 11:58:51.955051 518 kubelet.go:2892] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
Cause: the network add-on is not installed, so the CNI plugin cannot initialize. See the network add-on installation section of these notes.
Problem 4: failed to get sandbox image
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0122 16:26:36.872538 2885 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.
Cause: containerd's sandbox image does not match the one used by kubeadm; containerd is configured with registry.k8s.io/pause:3.6, while kubeadm uses registry.aliyuncs.com/google_containers/pause:3.9.
Solution: update containerd's sandbox image.
root@peag-k8s-worker:/etc/kubernetes# nano /etc/containerd/config.toml
root@peag-k8s-worker:/etc/kubernetes# cat /etc/containerd/config.toml
......
[plugins."io.containerd.grpc.v1.cri"]
device_ownership_from_security_context = false
disable_apparmor = false
disable_cgroup = false
disable_hugetlb_controller = true
disable_proc_mount = false
disable_tcp_service = true
enable_selinux = false
enable_tls_streaming = false
enable_unprivileged_icmp = false
enable_unprivileged_ports = false
ignore_image_defined_volumes = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
netns_mounts_under_state_dir = false
restrict_oom_score_adj = false
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
selinux_category_range = 1024
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
tolerate_missing_hugetlb_controller = true
unset_seccomp_profile = ""
......
systemctl restart containerd
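After the restart, the effective value can be double-checked from containerd's merged configuration (containerd config dump prints it):

containerd config dump | grep sandbox_image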
Joining worker nodes to the cluster
On the worker node, simply run the kubeadm join command printed by kubeadm init.
myserver@peag-k8s-worker:~$ sudo kubeadm join 192.168.122.105:6443 --token feecq5.fbbocvez6v854bj3 \
--discovery-token-ca-cert-hash sha256:c25c4c07624eb13db94236093f57d33dd8c6d591f1d61fcffcf83dba0df71378
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
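If the bootstrap token printed by kubeadm init has already expired (tokens are valid for 24 hours by default), a fresh join command can be generated on the control-plane node:

sudo kubeadm token create --print-join-command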
Check the node list on the master node
myserver@peag-k8s-master:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
peag-k8s-master Ready control-plane 5h53m v1.29.1
peag-k8s-worker Ready <none> 16s v1.29.1