Henry
Published on 2024-01-22

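Kubernetes Cluster Installation

Background

These notes walk through a Kubernetes installation in detail. The cluster is deployed with kubeadm, the officially supported deployment tool.

Environment

  1. OS: Debian (kernel package 6.1.66-1, 2023-12-09), x86_64 GNU/Linux
  2. containerd: v1.6.27
  3. Kubernetes: 1.29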

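Detailed Steps

Choosing a Container Runtime

Kubernetes v1.24 removed dockershim, the component that adapted Docker Engine to the kubelet, so this installation uses containerd as the container runtime.

containerd installation notes: Installation

Kubernetes Components

You need to install the following packages on every machine:

  • kubeadm: the command to bootstrap the cluster.
  • kubelet: the component that runs on every node in the cluster and starts Pods and containers.
  • kubectl: the command-line tool for talking to the cluster.

kubeadm will not install or manage kubelet or kubectl for you, so you need to make sure they match the version of the control plane that kubeadm installs. If they do not match, version skew can lead to unexpected errors and problems. One minor version of skew between the control plane and the kubelet is supported, but the kubelet version may never exceed the API server version. For example, a 1.7.0 kubelet is fully compatible with a 1.8.0 API server, but not the other way around.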
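
The skew rule above can be written down as a small check. This is only a sketch: `minor_of` and `kubelet_skew_ok` are hypothetical helper names, not kubeadm features, and the check assumes both versions share the same major version and follow the `vMAJOR.MINOR.PATCH` form.

```shell
# Hypothetical helpers (not part of kubeadm or kubectl).

# Extract the minor version: "v1.29.1" or "1.29.1" -> "29".
minor_of() {
    echo "$1" | sed 's/^v//' | cut -d. -f2
}

# Succeeds when the kubelet is at the same minor version as the API server
# or exactly one minor version behind it, as described in the text above.
# Assumption: both versions have the same major version.
kubelet_skew_ok() {
    kubelet_minor=$(minor_of "$1")
    apiserver_minor=$(minor_of "$2")
    diff=$((apiserver_minor - kubelet_minor))
    [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}

# Typical use on a node, comparing against the planned control-plane version:
#   kubelet_skew_ok "$(kubelet --version | awk '{print $2}')" v1.29.1 \
#       && echo "skew OK" || echo "kubelet version is out of range"
```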

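Installation and Configuration Prerequisites

cgroup Driver

On Linux, control groups (cgroups) are used to constrain the resources allocated to processes.

Both the kubelet and the underlying container runtime need to interface with control groups to enforce resource management for Pods and containers and to set requests and limits for resources such as CPU and memory. To interface with control groups, the kubelet and the container runtime each use a cgroup driver. The critical point is that the kubelet and the container runtime must use the same cgroup driver with the same configuration.

Change the containerd cgroup driver to systemd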

myserver@peag-k8s-master:~$ sudo nano /etc/containerd/config.toml 

......

      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]

        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]


		......
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            BinaryName = ""
            CriuImagePath = ""
            CriuPath = ""
            CriuWorkPath = ""
            IoGid = 0
            IoUid = 0
            NoNewKeyring = false
            NoPivotRoot = false
            Root = ""
            ShimCgroup = ""
            SystemdCgroup = true
            
            ......

Set the SystemdCgroup parameter to true, then restart containerd so the change takes effect.
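
For repeatable setups the same edit can be scripted instead of done in nano. The sketch below assumes the config file already contains a `SystemdCgroup = false` line (as a default config printed by `containerd config default` does); `enable_systemd_cgroup` is a hypothetical helper name.

```shell
# Hypothetical helper: flip SystemdCgroup from false to true in a containerd
# config file, keeping the original indentation. The match is case-sensitive,
# so the legacy lowercase "systemd_cgroup" key is left untouched.
enable_systemd_cgroup() {
    config_file=$1
    tmp_file=$(mktemp)
    sed 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' \
        "$config_file" > "$tmp_file" && cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}

# Typical use on a node (requires root):
#   enable_systemd_cgroup /etc/containerd/config.toml
#   systemctl restart containerd
```

Trying the function on a copy of the file first is a cheap way to confirm that only the intended line changes.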

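Installing kubeadm, kubelet, and kubectl

Disable Swap

Swap configuration: by default, the kubelet fails to start if swap memory is detected on a node. The kubelet has supported swap since v1.22. Since v1.28, swap is supported only with cgroup v2; the kubelet's NodeSwap feature gate is in Beta but disabled by default.

Check the swap status with: free -h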

myserver@peag-k8s-master:~$ sudo free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       462Mi       149Mi       3.6Mi       3.5Gi       3.4Gi
Swap:          974Mi       780Ki       974Mi

If the Swap row is missing or reports 0B in every column, swap is already disabled and you can skip this step.

Find the swap partition with: blkid

myserver@peag-k8s-master:~$ sudo blkid
/dev/vda5: UUID="7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d" TYPE="swap" PARTUUID="d56debd6-05"
/dev/vda1: UUID="b4f5b969-2634-4eb8-8fd0-2c28fbb2cac6" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="d56debd6-01"

To disable swap, edit /etc/fstab and comment out the line that configures the swap partition, for example: # UUID=7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d none            swap    sw              0       0

myserver@peag-k8s-master:~$ sudo nano /etc/fstab 
myserver@peag-k8s-master:~$ cat /etc/fstab 
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# systemd generates mount units based on this file, see systemd.mount(5).
# Please run 'systemctl daemon-reload' after making changes here.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/vda1 during installation
UUID=b4f5b969-2634-4eb8-8fd0-2c28fbb2cac6 /               ext4    errors=remount-ro 0       1
# swap was on /dev/vda5 during installation
# UUID=7e8b33f3-b940-4d8e-bcb6-474ed3f58c0d none            swap    sw              0       0
/dev/sr0        /media/cdrom0   udf,iso9660 user,noauto     0       0

Reboot the system for the change to take effect: reboot

myserver@peag-k8s-master:~$ sudo reboot

Broadcast message from root@peag-k8s-master on pts/1 (Fri 2024-01-19 11:45:29 HKT):

The system will reboot now!

After the reboot, verify the swap status again: free -h

myserver@peag-k8s-master:~$ sudo free -h

               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       331Mi       3.5Gi       3.6Mi       148Mi       3.5Gi
Swap:             0B          0B          0B
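
The fstab edit can also be scripted, and `swapoff -a` turns swap off immediately without a reboot. The following is a sketch only; `comment_out_swap` is a hypothetical helper, and it assumes fstab entries are whitespace-separated with the filesystem type in the third field.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style
# file. Lines that are already comments are left alone.
comment_out_swap() {
    fstab_file=$1
    tmp_file=$(mktemp)
    # Match uncommented lines whose third field (filesystem type) is "swap".
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "# " $0; next } { print }' \
        "$fstab_file" > "$tmp_file" && cat "$tmp_file" > "$fstab_file"
    rm -f "$tmp_file"
}

# Typical use on a node (requires root):
#   swapoff -a                     # disable swap right now
#   comment_out_swap /etc/fstab    # keep it disabled after the next boot
#   free -h                        # the Swap row should now show 0B
```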

Configure apt

Update the apt package index and install the packages needed to use the Kubernetes apt repository:

myserver@peag-k8s-master:~$ sudo apt update
myserver@peag-k8s-master:~$ sudo apt install -y apt-transport-https ca-certificates curl gpg

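Download the public signing key for the Kubernetes package repositories. All repositories use the same signing key, so you can disregard the version in the URL: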

myserver@peag-k8s-master:~$ curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
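
The gpg command above writes into /etc/apt/keyrings and fails if that directory is missing, which can happen on older Debian and Ubuntu releases. A small guard, sketched here with a hypothetical helper name `ensure_keyring_dir`, creates it before the key is downloaded:

```shell
# Hypothetical helper: create the apt keyring directory if it is missing.
# Defaults to /etc/apt/keyrings; a different path can be passed for a dry run.
ensure_keyring_dir() {
    keyring_dir=${1:-/etc/apt/keyrings}
    [ -d "$keyring_dir" ] || mkdir -p -m 755 "$keyring_dir"
}

# Typical use on a node (requires root), before running the curl | gpg step:
#   ensure_keyring_dir
```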

Add the Kubernetes apt repository. Note that this repository only contains packages for Kubernetes 1.29.

myserver@peag-k8s-master:~$ echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

Install

Update the apt package index, install kubelet, kubeadm, and kubectl, and pin their versions:

myserver@peag-k8s-master:~$ sudo apt update
myserver@peag-k8s-master:~$ sudo apt install -y kubelet kubeadm kubectl
myserver@peag-k8s-master:~$ sudo apt-mark hold kubelet kubeadm kubectl

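apt-mark hold is a command of the apt package manager on Debian and its derivatives (such as Ubuntu) that marks the given packages as held. Held packages are not upgraded by later automatic upgrades. This is useful for packages you do not want updated automatically, such as ones where stability matters or that have been customized.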
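
To confirm the hold took effect, `apt-mark showhold` lists the held packages, one per line. The filter below is a sketch with a hypothetical name, `missing_holds`; it assumes the packages are listed by their plain names, without an architecture suffix.

```shell
# Hypothetical helper: read `apt-mark showhold` output on stdin and print
# whichever of the three Kubernetes packages is NOT on hold.
missing_holds() {
    held=$(cat)
    for pkg in kubelet kubeadm kubectl; do
        echo "$held" | grep -qx "$pkg" || echo "$pkg"
    done
}

# Typical use:
#   apt-mark showhold | missing_holds    # no output means all three are held
# Before a planned upgrade, release the hold with:
#   sudo apt-mark unhold kubelet kubeadm kubectl
```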

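Check the installation result. The "connection to the server localhost:8080 was refused" message from kubectl is expected at this stage, because no cluster has been initialized yet.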

myserver@peag-k8s-master:~$ sudo kubectl version
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?
myserver@peag-k8s-master:~$ sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.1", GitCommit:"bc401b91f2782410b3fb3f9acf43a995c4de90d2", GitTreeState:"clean", BuildDate:"2024-01-17T15:49:02Z", GoVersion:"go1.21.6", Compiler:"gc", Platform:"linux/amd64"}
myserver@peag-k8s-master:~$ sudo kubelet --version
Kubernetes v1.29.1

Set SELinux to permissive mode (which effectively disables it).

myserver@peag-k8s-master:~$ sudo setenforce 0
myserver@peag-k8s-master:~$ sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

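Note: if you get the error "sudo: setenforce: command not found", SELinux is not installed and you can skip this step.

Setting SELinux to permissive mode by running setenforce 0 and sed ... effectively disables it. This is required to allow containers to access the host filesystem, which some container network plugins need, for example. You have to do this until SELinux support in the kubelet improves.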

Creating the Cluster with kubeadm - Master Node

Preparation

  • Set ip_forward to 1
myserver@peag-k8s-master:~$ sudo sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
myserver@peag-k8s-master:~$ cat /proc/sys/net/ipv4/ip_forward
1
  • Set bridge-nf-call-iptables to 1
myserver@peag-k8s-master:~$ sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-iptables = 1

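If the command fails with "sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory", bridge network filtering is not enabled in the kernel (typically the br_netfilter module is not loaded). See: Enabling bridge network filtering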
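
Values set with `sysctl -w` are lost on reboot. One way to make both settings and the br_netfilter module persistent is to drop config files under /etc/modules-load.d and /etc/sysctl.d. The sketch below uses a hypothetical helper, `write_k8s_net_config`, and the file name k8s.conf is just a conventional choice.

```shell
# Hypothetical helper: write the module list and sysctl settings under the
# given root directory ("/" on a real node, a temp dir for a dry run).
write_k8s_net_config() {
    root_dir=${1%/}
    mkdir -p "$root_dir/etc/modules-load.d" "$root_dir/etc/sysctl.d"
    # Module loaded at boot; br_netfilter provides
    # /proc/sys/net/bridge/bridge-nf-call-iptables.
    printf '%s\n' br_netfilter > "$root_dir/etc/modules-load.d/k8s.conf"
    # Kernel parameters applied at boot.
    cat > "$root_dir/etc/sysctl.d/k8s.conf" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
}

# Typical use on a node (requires root):
#   write_k8s_net_config /
#   modprobe br_netfilter    # load the module now
#   sysctl --system          # apply all sysctl config files now
```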

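Initialization

Before initializing, check whether this machine can reach registry.k8s.io. If it cannot, use a mirror registry that is reachable from mainland China.

Command to pull the images from the mirror: sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers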

myserver@peag-k8s-master:~$ sudo kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.29.1
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.11.1
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.9
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.10-0

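Initialize the node with: kubeadm init <args>

Run the initialization as root (or with sudo); otherwise you may hit permission denied errors.

myserver@peag-k8s-master:~$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.29.1
  • --pod-network-cidr=10.244.0.0/16: the CIDR of the Pod network. kubeadm has no default for this flag; 10.244.0.0/16 is the default network expected by Flannel, the network add-on installed later in these notes.
  • --service-cidr=10.96.0.0/12: the CIDR of the Service network. The default is 10.96.0.0/12.
  • --image-repository=registry.aliyuncs.com/google_containers: the image repository to pull control-plane images from.
  • --kubernetes-version=v1.29.1: the Kubernetes version to deploy.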
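
The same four settings can be kept in a kubeadm configuration file instead of on the command line, which is easier to version and reuse. This is a sketch, assuming the kubeadm.k8s.io/v1beta3 config API that kubeadm 1.29 accepts; `write_kubeadm_config` and the file name kubeadm-config.yaml are hypothetical.

```shell
# Hypothetical helper: write a ClusterConfiguration that mirrors the flags
# used above (version, image repository, Pod and Service CIDRs).
write_kubeadm_config() {
    cat > "$1" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.1
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
EOF
}

# Typical use (as root):
#   write_kubeadm_config kubeadm-config.yaml
#   kubeadm init --config kubeadm-config.yaml
```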

Output after a successful initialization

	......
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.122.105:6443 --token feecq5.fbbocvez6v854bj3 \
        --discovery-token-ca-cert-hash sha256:c25c4c07624eb13db94236093f57d33dd8c6d591f1d61fcffcf83dba0df71378
        

Following the prompt, set up the kubeconfig

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

View node information

myserver@peag-k8s-master:~$ kubectl get node -A
NAME              STATUS     ROLES           AGE   VERSION
peag-k8s-master   NotReady   control-plane   45m   v1.29.1

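No network add-on is installed yet, so the node is in the NotReady state.

Installing the Network Add-on

As the kubeadm init output suggests, pick a network add-on from the official site according to your needs and install it.

URL: https://kubernetes.io/docs/concepts/cluster-administration/addons/

This installation uses Flannel as the network add-on. One-step deployment command: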

myserver@peag-k8s-master:~$ kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Alternatively, download kube-flannel.yml locally and deploy it with kubectl apply -f kube-flannel.yml.

myserver@peag-k8s-master:~$ kubectl apply -f kube-flannel.yml 
namespace/kube-flannel created
serviceaccount/flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

Check the deployment result

Command: kubectl get nodes,pod -A

myserver@peag-k8s-master:~$ kubectl get nodes,pod -A
NAME                   STATUS   ROLES           AGE     VERSION
node/peag-k8s-master   Ready    control-plane   3h44m   v1.29.1

NAMESPACE      NAME                                          READY   STATUS    RESTARTS       AGE
kube-flannel   pod/kube-flannel-ds-6nf87                     1/1     Running   0              2m10s
kube-system    pod/coredns-5f98f8d567-h5bd9                  1/1     Running   0              3h44m
kube-system    pod/coredns-5f98f8d567-v8nt4                  1/1     Running   0              3h44m
kube-system    pod/etcd-peag-k8s-master                      1/1     Running   2 (3h7m ago)   3h44m
kube-system    pod/kube-apiserver-peag-k8s-master            1/1     Running   2 (3h7m ago)   3h44m
kube-system    pod/kube-controller-manager-peag-k8s-master   1/1     Running   2 (3h7m ago)   3h44m
kube-system    pod/kube-proxy-9dtpn                          1/1     Running   1 (3h7m ago)   3h44m
kube-system    pod/kube-scheduler-peag-k8s-master            1/1     Running   2 (3h7m ago)   3h44m
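
When scripting the setup, it helps to check node status without reading the table by eye. The filter below is a sketch with a hypothetical name, `not_ready_nodes`; it treats any STATUS other than exactly "Ready" as not ready, so a cordoned node (Ready,SchedulingDisabled) is reported too.

```shell
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Typical use:
#   kubectl get nodes --no-headers | not_ready_nodes
# kubectl can also block until the nodes are ready:
#   kubectl wait --for=condition=Ready nodes --all --timeout=120s
```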

Problems & Solutions

Problem 1: ip_forward contents are not set to 1
[init] Using Kubernetes version: v1.29.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
        [ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

Solution: see the Preparation section:

  • Set bridge-nf-call-iptables to 1
  • Set ip_forward to 1

Problem 2: couldn't get current server API group list
myserver@peag-k8s-master:~$ sudo kubectl get pod -A
E0122 11:48:06.079169    1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.080385    1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.082184    1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.083661    1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0122 11:48:06.084749    1152 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?

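Cause: kubectl has no kubeconfig to read, so it falls back to localhost:8080. This is a kubectl configuration problem rather than a kubelet permission problem. Set up the kubeconfig as instructed in the kubeadm init output, and if it was set up for the regular user, run kubectl as that user without sudo.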

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf
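
A quick way to see which kubeconfig kubectl will look for in the current shell is sketched below. `kubeconfig_path` and `has_kubeconfig` are hypothetical helpers, and the sketch assumes KUBECONFIG holds a single path rather than a colon-separated list.

```shell
# Hypothetical helpers for diagnosing the "localhost:8080" error.

# kubectl uses $KUBECONFIG when it is set, otherwise $HOME/.kube/config.
kubeconfig_path() {
    echo "${KUBECONFIG:-$HOME/.kube/config}"
}

# Succeeds when that file exists and is readable by the current user.
has_kubeconfig() {
    [ -r "$(kubeconfig_path)" ]
}

# Typical use:
#   has_kubeconfig || echo "no readable kubeconfig at $(kubeconfig_path)"
```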

Problem 3: Node is NotReady
myserver@peag-k8s-master:~$ kubectl get node -A
NAME              STATUS     ROLES           AGE   VERSION
peag-k8s-master   NotReady   control-plane   45m   v1.29.1
myserver@peag-k8s-master:~$ sudo journalctl -f -u kubelet.service
Jan 22 11:58:51 peag-k8s-master kubelet[518]: E0122 11:58:51.955051     518 kubelet.go:2892] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

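Cause: no network add-on is installed, so the container runtime network cannot become ready. For the fix, see the network add-on installation section of these notes.

Problem 4: failed to get sandbox image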
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0122 16:26:36.872538    2885 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.

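Cause: the sandbox (pause) image configured in containerd does not match the one kubeadm uses. containerd is set to registry.k8s.io/pause:3.6, while kubeadm expects registry.aliyuncs.com/google_containers/pause:3.9.

Solution: update the sandbox image in the containerd configuration.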

root@peag-k8s-worker:/etc/kubernetes# nano /etc/containerd/config.toml
root@peag-k8s-worker:/etc/kubernetes# cat /etc/containerd/config.toml

		......
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = false
    disable_apparmor = false
    disable_cgroup = false
    disable_hugetlb_controller = true
    disable_proc_mount = false
    disable_tcp_service = true
    enable_selinux = false
    enable_tls_streaming = false
    enable_unprivileged_icmp = false
    enable_unprivileged_ports = false
    ignore_image_defined_volumes = false
    max_concurrent_downloads = 3
    max_container_log_line_size = 16384
    netns_mounts_under_state_dir = false
    restrict_oom_score_adj = false
    sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
    selinux_category_range = 1024
    stats_collect_period = 10
    stream_idle_timeout = "4h0m0s"
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    systemd_cgroup = false
    tolerate_missing_hugetlb_controller = true
    unset_seccomp_profile = ""
    
    ......

After saving the file, restart containerd so the change takes effect: systemctl restart containerd
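
The sandbox_image edit has to be repeated on every node, so it can be convenient to script it. This is a sketch; `set_sandbox_image` is a hypothetical helper, and it assumes the config file already contains a sandbox_image line and that the image reference contains no "|" or "&" characters.

```shell
# Hypothetical helper: point containerd's sandbox_image at the given image,
# keeping the original indentation of the line.
set_sandbox_image() {
    config_file=$1
    image=$2
    tmp_file=$(mktemp)
    # "|" is the sed delimiter because image references contain "/".
    sed "s|^\([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*\).*|\1\"$image\"|" \
        "$config_file" > "$tmp_file" && cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}

# Typical use on a node (requires root):
#   set_sandbox_image /etc/containerd/config.toml \
#       registry.aliyuncs.com/google_containers/pause:3.9
#   systemctl restart containerd
```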

Joining Worker Nodes to the Cluster

Run the kubeadm join command printed by kubeadm init directly on the worker node.

myserver@peag-k8s-worker:~$ sudo kubeadm join 192.168.122.105:6443 --token feecq5.fbbocvez6v854bj3 \
        --discovery-token-ca-cert-hash sha256:c25c4c07624eb13db94236093f57d33dd8c6d591f1d61fcffcf83dba0df71378
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
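
Bootstrap tokens expire (after 24 hours by default), so a join command copied from an old kubeadm init output may be rejected. Running `kubeadm token create --print-join-command` on the control-plane node prints a fresh one. The check below is only a sanity test for the token's shape before pasting it; `is_bootstrap_token` is a hypothetical helper.

```shell
# Hypothetical helper: succeeds when the argument has the shape of a kubeadm
# bootstrap token, i.e. 6 characters, a dot, then 16 characters, all from
# the set [a-z0-9].
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eqx '[a-z0-9]{6}\.[a-z0-9]{16}'
}

# Typical use:
#   is_bootstrap_token "$TOKEN" || echo "token looks malformed"
# On the control-plane node, to get a new join command:
#   sudo kubeadm token create --print-join-command
```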

View node information on the master node

myserver@peag-k8s-master:~$ kubectl get nodes
NAME              STATUS   ROLES           AGE     VERSION
peag-k8s-master   Ready    control-plane   5h53m   v1.29.1
peag-k8s-worker   Ready    <none>          16s     v1.29.1
