Planning
1. Server configuration
OS | Specs | Role
---|---|---
CentOS 7.9 (172.27.0.13) | 2C/4G | k8s-master
CentOS 7.9 (172.27.0.9) | 2C/4G | k8s-work1
CentOS 7.9 (172.27.0.10) | 2C/4G | k8s-work2
Note: this is a low-spec lab environment used to demonstrate the k8s cluster installation; in production our servers start at 8C/16G.
2. Version selection
- CentOS: 7.9+
- k8s components: 1.23.6 (the latest at the time of writing)
I. Basic server configuration
1. Set hostnames
Run on all nodes:
[root@server ~]
[root@server ~]
[root@server ~]
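The commands themselves were not preserved above; setting the hostname typically looks like this (one command per node, names taken from the planning table):

```bash
# on 172.27.0.13
hostnamectl set-hostname k8s-master
# on 172.27.0.9
hostnamectl set-hostname k8s-work1
# on 172.27.0.10
hostnamectl set-hostname k8s-work2
```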
2. Disable the firewall
Run on all nodes:
[root@k8s-master ~]
[root@k8s-master ~]
[root@k8s-master ~]
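The exact commands are missing here; on CentOS 7 this step usually amounts to stopping and disabling firewalld, and many guides also disable SELinux at the same time (that extra step is my assumption, not necessarily what was run):

```bash
systemctl stop firewalld
systemctl disable firewalld

# assumed extra step: switch SELinux off (permissive now, disabled after reboot)
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
```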
3. Local name resolution for all nodes
Run on all nodes:
[root@k8s-master ~]
172.27.0.13 k8s-master
172.27.0.9 k8s-work1
172.27.0.10 k8s-work2
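A simple way to append these entries, assuming /etc/hosts does not already contain them:

```bash
cat >> /etc/hosts << EOF
172.27.0.13 k8s-master
172.27.0.9 k8s-work1
172.27.0.10 k8s-work2
EOF
```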
4. Passwordless SSH between nodes (optional)
Run on all nodes:
[root@k8s-master ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:qmMRk/pyFrxMRCqzeko/fPbVBzPYz1Em4u5cNR7dvzs root@k8s-master
The key's randomart image is:
+---[RSA 2048]----+
| |
| . |
| o . . . o |
|o . = + . + o|
| + + o S. * . +o|
|. . = . o * + +|
|...+ +. . o = ..|
|o +oO+ . o o E.|
|.o *=... o oo|
+----[SHA256]-----+
Run on all nodes (exchange public keys with every other node):
[root@k8s-master ~]# ssh-copy-id root@172.27.0.9
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host '172.27.0.9 (172.27.0.9)' can't be established.
ECDSA key fingerprint is SHA256:IzYTCZWXEv8rTdYYx+RdTyi+EJF2Jqggz0pT5v/oZwk.
ECDSA key fingerprint is MD5:d0:89:66:b8:73:d0:eb:3b:19:cb:b2:3c:82:d0:a5:ff.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@172.27.0.9's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@172.27.0.9'"
and check to make sure that only the key(s) you wanted were added.
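ssh-copy-id has to be repeated for every other node; a small loop (IP list taken from the planning table) saves some typing:

```bash
for ip in 172.27.0.13 172.27.0.9 172.27.0.10; do
  ssh-copy-id root@$ip
done
```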
5. Load the br_netfilter module
Make sure the br_netfilter module is loaded. Run on all nodes:
[root@k8s-master ~]# modprobe br_netfilter
[root@k8s-master ~]# lsmod | grep br_netfilter
br_netfilter 22256 0
bridge 151336 1 br_netfilter
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
6. Allow iptables to see bridged traffic
Run on all nodes:
[root@k8s-master ~]# cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
[root@k8s-master ~]# cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
[root@k8s-master ~]# sysctl --system
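As a quick sanity check (not in the original article), the two kernel parameters can be read back after running sysctl --system:

```bash
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
# both should report "= 1"
```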
7. Disable swap
Run on all nodes:
[root@k8s-master ~]
[root@k8s-master ~]
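The commands behind these two prompts were lost; disabling swap usually looks like this (turn it off now and keep it off across reboots):

```bash
swapoff -a
# comment out the swap line so it stays off after a reboot
sed -ri 's/.*swap.*/#&/' /etc/fstab
```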
8. Time synchronization
Run on all nodes:
[root@k8s-master ~]
26 Apr 19:58:05 ntpdate[13947]: the NTP socket is in use, exiting
[root@k8s-master ~]
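The message above only means an NTP daemon already owns the socket. The original commands were not captured; a common way to handle time sync on CentOS 7 (the NTP server address is my assumption) is:

```bash
# if chronyd is already running, just make sure it is enabled
systemctl enable --now chronyd

# or, for a one-shot sync with ntpdate, stop the daemon holding the socket first
systemctl stop ntpd
ntpdate ntp.aliyun.com
```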
9. Install Docker
Run on all nodes
The process is omitted here; message me if you want the quick Docker install script.
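Since the Docker install is omitted, here is a rough sketch of a typical CentOS 7 install (the mirror URL and settings are assumptions, not the author's script); setting Docker's cgroup driver to systemd up front also avoids the mismatch described in FAQ 1:

```bash
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce

# make Docker's cgroup driver match the kubelet's (systemd)
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

systemctl enable --now docker
```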
10. Install kubeadm and kubelet
Run on all nodes
- Add the k8s package repo
Repo reference: https://developer.aliyun.com/mirror/kubernetes?spm=a2c6h.13651102.0.0.1cd01b116JYQIn
[root@k8s-master ~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
- Build the k8s yum cache:
[root@k8s-master ~]# yum makecache
- Install the k8s tools; first list the available versions:
[root@k8s-master ~]# yum list kubelet --showduplicates
...
...
kubelet.x86_64 1.23.0-0 kubernetes
kubelet.x86_64 1.23.1-0 kubernetes
kubelet.x86_64 1.23.2-0 kubernetes
kubelet.x86_64 1.23.3-0 kubernetes
kubelet.x86_64 1.23.4-0 kubernetes
kubelet.x86_64 1.23.5-0 kubernetes
kubelet.x86_64 1.23.6-0 kubernetes
[root@k8s-master ~]
[root@k8s-master ~]
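The last two prompts above lost their commands; installing the pinned 1.23.6 packages and enabling the kubelet would look roughly like this:

```bash
yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6
systemctl enable kubelet
```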
II. Master node
1. Initialize k8s
Run on the master node:
[root@k8s-master ~]# kubeadm init \
--apiserver-advertise-address=172.27.0.13 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.6 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--ignore-preflight-errors=all
Flag notes:
- --apiserver-advertise-address: the address the API server advertises, i.e. the master node's IP (172.27.0.13)
- --image-repository: pull the control-plane images from the Aliyun registry instead of k8s.gcr.io
- --kubernetes-version: the exact k8s version to deploy (v1.23.6)
- --service-cidr: the virtual IP range assigned to Services
- --pod-network-cidr: the IP range assigned to Pods; it must match what the network plugin expects (10.244.0.0/16 here)
Output after initialization:
...
...
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.27.0.13:6443 --token hgtxra.fccj35x2szia3r3c \
--discovery-token-ca-cert-hash sha256:578cf5ca4cf588e3d84005d06f6503bf5d9ee25f63b0cfab4f78677a24b92bdd
2. Create the config files as instructed by the output
Run on the master node:
[root@k8s-master ~]# mkdir -p $HOME/.kube
[root@k8s-master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master ~]# chown $(id -u):$(id -g) $HOME/.kube/config
3. Check the k8s system pods
Run on the master node:
[root@k8s-master ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6d8c4cb4d-85dx9 0/1 Pending 0 53m
coredns-6d8c4cb4d-f7wld 0/1 Pending 0 53m
etcd-k8s-master 1/1 Running 1 53m
kube-apiserver-k8s-master 1/1 Running 1 53m
kube-controller-manager-k8s-master 1/1 Running 1 53m
kube-proxy-5mpdp 1/1 Running 0 13m
kube-proxy-9lp29 1/1 Running 0 12m
kube-proxy-9ttf6 1/1 Running 0 53m
kube-scheduler-k8s-master 1/1 Running 1 53m
4. Check the k8s nodes
Run on the master node:
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady control-plane,master 8m9s v1.23.6
Only the k8s-master node is present so far, and its status is NotReady, because we have not deployed a network plugin yet (kubectl apply -f [podnetwork].yaml), so the next step is to deploy the container network (CNI).
5. Deploy the container network (CNI)
Run on the master node. Add-on list: https://kubernetes.io/docs/concepts/cluster-administration/addons/ (this URL is printed when the k8s-master initialization succeeds).
- Choose a mainstream container network plugin to deploy (Calico)
- Download the yaml manifest:
wget https://docs.projectcalico.org/manifests/calico.yaml
- Apply it as the init output instructs:
[root@k8s-master ~]# kubectl apply -f calico.yaml
...
...
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/calico-kube-controllers created
- Check which images the yaml needs to pull:
[root@k8s-master ~]# grep image: calico.yaml
image: docker.io/calico/cni:v3.22.2
image: docker.io/calico/cni:v3.22.2
image: docker.io/calico/pod2daemon-flexvol:v3.22.2
image: docker.io/calico/node:v3.22.2
image: docker.io/calico/kube-controllers:v3.22.2
- Check that all pods are Running:
[root@k8s-master ~]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c845d499-rh7tb 1/1 Running 0 5m27s
calico-node-fpdjb 1/1 Running 0 5m28s
calico-node-jsdf4 1/1 Running 0 5m28s
calico-node-kmpnr 1/1 Running 0 5m28s
coredns-6d8c4cb4d-85dx9 1/1 Running 0 98m
coredns-6d8c4cb4d-f7wld 1/1 Running 0 98m
etcd-k8s-master 1/1 Running 1 99m
kube-apiserver-k8s-master 1/1 Running 1 99m
kube-controller-manager-k8s-master 1/1 Running 1 99m
kube-proxy-5mpdp 1/1 Running 0 58m
kube-proxy-9lp29 1/1 Running 0 58m
kube-proxy-9ttf6 1/1 Running 0 98m
kube-scheduler-k8s-master 1/1 Running 1 99m
III. Worker nodes
1. Join the worker nodes to the k8s cluster
Run on all worker nodes:
[root@k8s-work1 ~]# kubeadm join 172.27.0.13:6443 --token hgtxra.fccj35x2szia3r3c --discovery-token-ca-cert-hash sha256:578cf5ca4cf588e3d84005d06f6503bf5d9ee25f63b0cfab4f78677a24b92bdd
[root@k8s-work2 ~]# kubeadm join 172.27.0.13:6443 --token hgtxra.fccj35x2szia3r3c --discovery-token-ca-cert-hash sha256:578cf5ca4cf588e3d84005d06f6503bf5d9ee25f63b0cfab4f78677a24b92bdd
2. Check the cluster nodes
Run on the master node:
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane,master 99m v1.23.6
k8s-work1 Ready <none> 59m v1.23.6
k8s-work2 Ready <none> 58m v1.23.6
All nodes are now in the Ready state.
IV. Verification
Deploy an nginx service on the k8s cluster and access it from a browser to verify the cluster works.
1. Create the pod
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
2. Access nginx
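The access step itself is not shown in the original; one way to verify it (the NodePort placeholder below stands for whatever port kubectl get svc reports) is:

```bash
kubectl get svc nginx                 # note the NodePort, e.g. 80:3xxxx/TCP
curl http://172.27.0.13:<NodePort>    # or open http://<any-node-ip>:<NodePort> in a browser
```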
At this point the kubeadm-based k8s cluster deployment is complete.
FAQ
1. k8s initialization (kubeadm init) error
...
...
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Check the kubelet logs:
Apr 26 20:33:30 test3 kubelet: I0426 20:33:30.588349 21936 docker_service.go:264] "Docker Info" dockerInfo=&{ID:2NSH:KJPQ:XOKI:5XHN:ULL3:L4LG:SXA4:PR6J:DITW:HHCF:2RKL:U2NJ Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:7 Driver:overlay2 DriverStatus:[[Backing Filesystem extfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true KernelMemoryTCP:false CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true PidsLimit:false IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:24 OomKillDisable:true NGoroutines:45 SystemTime:2022-04-26T20:33:30.583063427+08:00 LoggingDriver:json-file CgroupDriver:cgroupfs CgroupVersion: NEventsListener:0 KernelVersion:3.10.0-1160.59.1.el7.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSVersion: OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc000263340 NCPU:2 MemTotal:3873665024 GenericResources:[] DockerRootDir:/var/lib/docker HTTPProxy: HTTPSProxy: NoProxy: Name:k8s-master Labels:[] ExperimentalBuild:false ServerVersion:18.06.3-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[] Shim:<nil>}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil> Warnings:[]} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:468a545b9edcd5932818eb9de8e72413e616e86e Expected:468a545b9edcd5932818eb9de8e72413e616e86e} RuncCommit:{ID:a592beb5bc4c4092b1b1bac971afed27687340c5 Expected:a592beb5bc4c4092b1b1bac971afed27687340c5} InitCommit:{ID:fec3683 Expected:fec3683} SecurityOptions:[name=seccomp,profile=default] ProductLicense: DefaultAddressPools:[] Warnings:[]}
Apr 26 20:33:30 test3 kubelet: E0426 20:33:30.588383 21936 server.go:302] "Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
The last line of the error explains it: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs". In other words, the kubelet and Docker use different cgroup drivers: the kubelet uses systemd while Docker uses cgroupfs.
Take a quick look at Docker's driver:
[root@k8s-master opt]# docker info | grep "Cgroup Driver"
Cgroup Driver: cgroupfs
Solution: reset the initialization and delete the leftover files, then change Docker's cgroup driver to systemd.
[root@k8s-master ~]
[root@k8s-master ~]
[root@k8s-master opt]
{
"registry-mirrors": ["https://q1rw9tzz.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"]
}
[root@k8s-master opt]
[root@k8s-master opt]
[root@k8s-master ~]
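The commands behind the six prompts above were not captured; the recovery sequence was most likely along these lines (the daemon.json content is the one shown above, everything else is an assumption):

```bash
# undo the failed init and remove leftovers
kubeadm reset -f
rm -rf $HOME/.kube

# write /etc/docker/daemon.json with the content shown above, then restart Docker
systemctl daemon-reload
systemctl restart docker

# finally re-run the kubeadm init command from section II.1
```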
2. Errors when joining worker nodes to the cluster
Error 1:
accepts at most 1 arg(s), received 3
To see the stack trace of this error execute with --v=5 or higher
Cause: the command was malformed. I pasted the multi-line kubeadm join command straight from the k8s-master init output, and the broken formatting caused the error; it is better to copy it into a text file first, reformat it into a single line, and then paste and run it.
Error 2:
[root@k8s-work1 ~]# kubeadm join 172.27.0.13:6443 --token hgtxra.fccj35x2szia3r3c --discovery-token-ca-cert-hash sha256:578cf5ca4cf588e3d84005d06f6503bf5d9ee25f63b0cfab4f78677a24b92bdd
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Solution
Check which process is occupying the port, as the error message suggests:
[root@k8s-work1 ~]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 17616/kubelet
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1152/sshd
tcp 0 0 127.0.0.1:36281 0.0.0.0:* LISTEN 17616/kubelet
tcp6 0 0 :::10250 :::* LISTEN 17616/kubelet
tcp6 0 0 :::10255 :::* LISTEN 17616/kubelet
Cause: port 10250 is already occupied (by a kubelet that is still running); kill that process and run the join again.
[root@k8s-work1 ~]
[root@k8s-work1 ~]# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1152/sshd
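A sketch of what the two prompts above most likely did (the PID comes from the netstat output; running kubeadm reset as well is my own suggestion, in case a half-finished join left state behind):

```bash
kill -9 17616      # the kubelet PID reported by netstat
kubeadm reset -f   # optional: clear any partial join state
# then re-run the kubeadm join command printed by the master's init output
```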