This article is a hands-on introduction to k8s. Kubernetes does not score well on ease of use: every major version change breaks existing tutorials, so this article reorganizes the process based on our past experience and practice.
The deployment below targets k8s 1.28.
References:
k8s docs: https://kubernetes.io/docs/home/
Geek Time: https://time.geekbang.org/column/article/39712
Aliyun mirror: https://developer.aliyun.com/mirror/kubernetes
Also, k8s pulls a large number of images from the k8s registry, Docker Hub and ghcr. You'd better set up your own nexus to solve this once and for all; for how to configure it, see our notes on nexus configuration practices.
1. Installation
Refer to the Aliyun mirror: https://developer.aliyun.com/mirror/kubernetes
What follows is based on Rocky Linux 9 and applies essentially unchanged to AlmaLinux 9, CentOS Stream 9 and RHEL 9; for other operating systems, consult the Aliyun instructions.
1.1. Update the operating system
$ dnf install -y https://mirrors.aliyun.com/epel/epel-release-latest-9.noarch.rpm;
$ sed -i 's|^#baseurl=https://download.example/pub|baseurl=https://mirrors.aliyun.com|' /etc/yum.repos.d/epel*;
$ sed -i 's|^metalink|#metalink|' /etc/yum.repos.d/epel*;
$ dnf remove -y docker-ce docker-ce-cli containerd.io cri-o kubelet kubeadm kubectl cri-tools kubernetes-cni;rm -rf /etc/yum.repos.d/docker-ce.repo
$ dnf clean all;dnf makecache -y;dnf update -y;dnf groupinstall -y 'development tools' ;dnf install -y go nfs-utils;
Also, make sure /etc/hosts contains correct hostname-to-IP mappings for every machine.
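For example, a hypothetical three-node layout (the hostnames and the two slave IPs are made up for illustration; 10.0.1.170 is the master IP used later in this article):

$ cat >> /etc/hosts <<-'EOF'
10.0.1.170 k8s-master
10.0.1.171 k8s-slave1
10.0.1.172 k8s-slave2
EOF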
1.2. Preparation
You'd better run your own nexus as a proxy for the docker registry; here we assume your private docker registry is docker.test.com.
$ swapoff -a;
$ /usr/bin/crb enable ;setenforce 0;
$ systemctl stop firewalld.service;systemctl disable firewalld.service;sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config;
$ tee /etc/modules-load.d/k8s.conf <<-'EOF'
overlay
br_netfilter
EOF
$ modprobe overlay;modprobe br_netfilter;
$ tee /etc/sysctl.d/k8s.conf <<-'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
$ sysctl -p /etc/sysctl.d/k8s.conf;sysctl --system;
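Note that swapoff -a only disables swap until the next reboot, and kubelet requires swap to stay off. To make it permanent, also comment out the swap entry in /etc/fstab; a minimal sketch, assuming the default fstab layout:

$ sed -ri '/\sswap\s/s/^/#/' /etc/fstab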
1.2.1. Container runtime: CRI-O
To use CRI-O, follow the official instructions: https://github.com/cri-o/cri-o/blob/main/install.md#installation-instructions
On choosing a container runtime, see the official docs: https://v1-28.docs.kubernetes.io/docs/setup/production-environment/container-runtimes
$ curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable.repo https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/CentOS_9_Stream/devel:kubic:libcontainers:stable.repo
$ curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable:cri-o:1.28.2.repo https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable:cri-o:1.28:1.28.2/CentOS_9_Stream/devel:kubic:libcontainers:stable:cri-o:1.28:1.28.2.repo
$ dnf install -y cri-o;
$ vim /etc/crio/crio.conf
$ vim /etc/containers/registries.conf
$ systemctl enable crio && systemctl start crio;
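For reference, a minimal sketch of what /etc/containers/registries.conf could contain, assuming docker.test.com proxies Docker Hub (adjust the prefix/location pairs to your own mirrors):

unqualified-search-registries = ["docker.test.com"]

[[registry]]
prefix = "docker.io"
location = "docker.test.com"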
CRI-O does not use containerd or docker, but the day-to-day commands are almost identical to docker's:

$ crictl ps -a
$ crictl pull xxxxxx
1.2.2. [Recommended] Container runtime: containerd
Follow the official guide: https://github.com/containerd/containerd/blob/main/docs/getting-started.md
Note: do not install it through the package manager, or you may end up with the docker shim instead.
$ curl -L "https://github.com/containerd/containerd/releases/download/v1.7.15/containerd-1.7.15-linux-amd64.tar.gz" -O
$ curl -L "https://github.com/containernetworking/plugins/releases/download/v1.4.1/cni-plugins-linux-amd64-v1.4.1.tgz" -O
$ curl -L "https://github.com/opencontainers/runc/releases/download/v1.1.12/runc.amd64" -O
$ tar Cxzvf /usr/local/ ./containerd-1.7.15-linux-amd64.tar.gz
$ install -m 755 runc.amd64 /usr/local/sbin/runc
$ mkdir -p /opt/cni/bin;tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.4.1.tgz
Register containerd as a systemd service:
$ vim /etc/systemd/system/containerd.service

[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target

$ systemctl daemon-reload;systemctl enable --now containerd
Switch the cgroup driver to systemd:
$ mkdir /etc/containerd/;containerd config default > /etc/containerd/config.toml;sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml;sed -i 's/registry.k8s.io\/pause:3.8/docker.test.com\/pause:3.9/g' /etc/containerd/config.toml;
$ systemctl restart containerd;
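To confirm the change took effect, you can dump the live configuration and check for SystemdCgroup = true (assuming containerd is on PATH as installed above):

$ containerd config dump | grep SystemdCgroup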
1.3. Install Kubernetes
$ tee /etc/yum.repos.d/kubernetes.repo <<-'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
$ dnf install -y --disableexcludes=kubernetes kubeadm kubelet kubectl ;
$ crictl config runtime-endpoint unix:///var/run/containerd/containerd.sock;crictl config image-endpoint unix:///var/run/containerd/containerd.sock;
$ systemctl enable --now kubelet;
2. Initialization
You'd better run your own nexus as a proxy for the docker registry; as before, we assume your private docker registry is docker.test.com.
2.1. Things to know first
Check which images initialization needs:
$ kubeadm config images list
$ kubeadm config images pull
$ kubeadm config images list --config kubeadm.yaml
$ kubeadm config images pull --config kubeadm.yaml
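The kubeadm.yaml referenced above is not shown in this article; a minimal hypothetical version matching the init flags used in 2.2 could look like this (the kubernetesVersion value is illustrative):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.0
imageRepository: docker.test.com
controlPlaneEndpoint: 10.0.1.170:6443
networking:
  podSubnet: 10.244.0.0/16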
Also, whenever something goes wrong, just reset and start over:
$ kubeadm reset -f;rm -rf /etc/cni /var/lib/cni/ /etc/kubernetes /var/lib/dockershim /var/lib/etcd /var/lib/kubelet /var/run/kubernetes ~/.kube/*;mkdir -p /etc/cni/net.d/;
Reference: https://stackoverflow.com/questions/44698283/how-to-completely-uninstall-kubernetes
2.2. Initialize the master node
$ kubeadm init --control-plane-endpoint 10.0.1.170:6443 --apiserver-advertise-address=10.0.1.170 --image-repository=docker.test.com --pod-network-cidr=10.244.0.0/16 --v=5
Things to note:
- control-plane-endpoint and apiserver-advertise-address both use the local IP; again, /etc/hosts must map the machine name to this local IP.
- pod-network-cidr is the pod network range; 10.244.0.0/16 is the default expected by the flannel network plugin, so do not change it.
- image-repository must be set: k8s pulls its images from a repo (by default https://registry.k8s.io ), so set up a proxy for https://registry.k8s.io in your own nexus and put your nexus address here.
- v=5 is the log verbosity level.
2.3. Post-startup steps
2.3.1. Certificates
The cluster's certificates are valid for 1 year by default. Check their expiry:
$ kubeadm certs check-expiration
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Feb 18, 2025 07:25 UTC   364d            ca                      no
apiserver                  Feb 18, 2025 07:25 UTC   364d            ca                      no
apiserver-etcd-client      Feb 18, 2025 07:25 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Feb 18, 2025 07:25 UTC   364d            ca                      no
controller-manager.conf    Feb 18, 2025 07:25 UTC   364d            ca                      no
etcd-healthcheck-client    Feb 18, 2025 07:25 UTC   364d            etcd-ca                 no
etcd-peer                  Feb 18, 2025 07:25 UTC   364d            etcd-ca                 no
etcd-server                Feb 18, 2025 07:25 UTC   364d            etcd-ca                 no
front-proxy-client         Feb 18, 2025 07:25 UTC   364d            front-proxy-ca          no
scheduler.conf             Feb 18, 2025 07:25 UTC   364d            ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Feb 16, 2034 07:06 UTC   9y              no
etcd-ca                 Feb 16, 2034 07:06 UTC   9y              no
front-proxy-ca          Feb 16, 2034 07:06 UTC   9y              no
Certificates can be renewed with kubeadm certs renew all, but that has to be repeated every year.
Instead, the following script can sign 9-year certificates in one go; see: https://github.com/yuyicai/update-kube-cert/blob/master/update-kubeadm-cert.sh
Usage:
$ chmod 755 ./update-kubeadm-cert.sh;./update-kubeadm-cert.sh all --cri containerd
It automatically re-signs the certificates and restarts kube-apiserver, kube-controller-manager, kube-scheduler, and etcd.
2.3.2. Add the cluster configuration
The init log ends with the following hints; go through them one by one.
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  chown $(id -u):$(id -g) $HOME/.kube/config
Or add the environment variable:
Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf
The next hint tells you how to add more nodes; keep it, and run this command later on each new node that joins the cluster:
Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.11:6443 --token ennal9.tobyb8of2c8yeppx --discovery-token-ca-cert-hash sha256:cde119d31cbda65b693cd84cee70580764f09714e43e228d05d4e6cc1b50c8b1
It also prompts you to deploy a pod network; we will handle this shortly:
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
2.3.3. Install the network plugin: flannel
Reference: https://gist.github.com/rkaramandi/44c7cea91501e735ea99e356e9ae7883#configure-kubernetes-master
$ curl -L https://github.com/coreos/flannel/raw/master/Documentation/kube-flannel.yml -O
$ sed -i 's/docker.io/docker.test.com/g' kube-flannel.yml
$ kubectl apply -f ./kube-flannel.yml
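To verify the CNI is up (recent flannel manifests deploy into the kube-flannel namespace; older versions used kube-system):

$ kubectl get pods -n kube-flannel -o wide
$ kubectl get nodes    # nodes should flip to Ready once flannel is running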
2.3.4. Install metrics-server
$ curl -L -o metrics.yml https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
$ sed -i 's/registry.k8s.io/docker.test.com/g' ./metrics.yml

Then edit the container args in metrics.yml and append --kubelet-insecure-tls:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls

$ kubectl apply -f ./metrics.yml
Verify it works:
$ kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   232m         5%     1708Mi          46%
k8s-slave1   29m          1%     594Mi           34%
k8s-slave2   25m          1%     556Mi           32%
$ kubectl top pod
2.3.5. Inspect the cluster
Once everything is in Running state, the cluster is ready:
$ kubectl get nodes --show-labels
NAME      STATUS   ROLES                  AGE   VERSION
centos7   Ready    control-plane,master   53s   v1.22.4
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-59c77d78dd-ghjnk   1/1     Running   0          52s
kube-system   coredns-59c77d78dd-sb2lc   1/1     Running   0          52s
$ kubectl describe pod coredns-59c77d78dd-ghjnk -n kube-system
3. Node management
3.1. Inspect
$ kubectl get nodes --show-labels
$ kubectl label nodes test-node-03 deploy-type=dynamic-node
$ kubectl get pods --all-namespaces
$ kubectl describe pod coredns-59c77d78dd-fh47c -n kube-system
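A label applied this way is typically consumed through a nodeSelector; a hypothetical Deployment fragment (the label key/value match the command above, everything else is illustrative):

spec:
  template:
    spec:
      nodeSelector:
        deploy-type: dynamic-node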
3.2. Add a node
$ kubeadm token create --print-join-command
$ kubeadm join 192.168.1.11:6443 --token ennal9.tobyb8of2c8yeppx --discovery-token-ca-cert-hash sha256:cde119d31cbda65b693cd84cee70580764f09714e43e228d05d4e6cc1b50c8b1
4. Dashboard
4.1. Installation
Official repo: https://github.com/kubernetes/dashboard
Here it is exposed mainly through ingress-nginx.
$ wget -e "https_proxy=http://your-proxy" -O dashboard.yml https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
$ sed -i 's/image: /image: docker.test.com\//g' dashboard.yml

Then edit dashboard.yml. On the kubernetes-dashboard Service, expose a plain-HTTP port:

spec:
  ports:
    - port: 443
      targetPort: 8443
      name: https
    - port: 80
      targetPort: 9090
      name: http

On the dashboard container, switch the liveness probe to HTTP, open port 9090, and adjust the args:

        livenessProbe:
          httpGet:
            scheme: HTTP
            path: /
            port: 9090
        ports:
          - containerPort: 8443
            protocol: TCP
          - containerPort: 9090
            protocol: TCP
        args:

Finally, point the kubernetes-dashboard ClusterRoleBinding at cluster-admin:

kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin

$ kubectl apply -f dashboard.yml
Add an ingress proxy, ingress-dashboard.yml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-dashboard
  namespace: kubernetes-dashboard
spec:
  ingressClassName: nginx
  rules:
    - host: "k8s-dashboard.dev.com"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kubernetes-dashboard
                port:
                  number: 80
Finally, run kubectl apply -f ingress-dashboard.yml, and the dashboard is reachable at k8s-dashboard.dev.com:30080.
Make sure the domain resolves properly. Also, 30080 is the default HTTP port exposed by the ingress controller here; adjust it to whatever port your ingress actually uses.
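If there is no real DNS entry for k8s-dashboard.dev.com, a hypothetical client-side /etc/hosts line pointing it at a node running the ingress controller is enough for testing (10.0.1.170 is the master IP from 2.2, used here for illustration):

10.0.1.170 k8s-dashboard.dev.com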
4.2. Login
For the permission model, see the official docs: https://github.com/kubernetes/dashboard#create-an-authentication-token-rbac
The notes here also draw on: https://kuboard.cn/install/install-k8s-dashboard.html#访问
Create an auth.yml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
Apply it and read back the token:
$ kubectl apply -f ./auth.yml
$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
Name:         admin-user-token-p25dh
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin-user
              kubernetes.io/service-account.uid: 8b6ed1d6-05d4-44e5-b6be-1403f8e86d41

Type:  kubernetes.io/service-account-token

Data
====
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IndSZjhsaGZKeENYcmZrTlRnd29zbkFrdHNRdGVpYUJtQmFyYzdjajNqTWcifQ.eyJpc3MiOiJrdWJlcm5ldGdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.1_7Zy7qHb26YhntIqQNBeAorxuu7hxEGXFBZUN4CjdZrJCKTdESt2QLR3BWh0EIoQHCLmCmqoRZQO-ti4BCsV1Gb_oC25iLyTW817HzGeUcfPRkmIc2KPYZrGZGj6Sp_zYKUgAxSAVkn4VsLSDkIaCW6n3yCfuzGM477qs4W4ziPWvFsSdzUbQy42cNcNuAv9YRqUQU7V5lOHw7ry6ort-X48De2fX1Z2_ZrJbIoeeH-c7V50le_Czy97gDCvysKsgQ3EqlZGgFZVIU5pC-ghM3YH99FGaL7avAyFnXkks6zQSaoH4Kbf_8qOWQ9uoS_N97AUp8VtByW6bcQwloT8w
ca.crt:     1099 bytes
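Note that on k8s 1.24+ (including the 1.28 used here), ServiceAccounts no longer get a long-lived token Secret automatically, so the grep above may come back empty; in that case issue a token directly:

$ kubectl -n kubernetes-dashboard create token admin-user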
5. StorageClass
NFS is recommended as the default storage. Reference: https://cloud.tencent.com/developer/article/2365976
Prepare an ordinary server as the NFS server:
$ dnf install nfs-utils -y ;
$ mkdir -p /data/k8s;
$ vim /etc/exports
/data/k8s *(rw,no_root_squash)
$ systemctl enable nfs-server --now
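From any k8s node (nfs-utils was already installed in 1.1), you can sanity-check the export before wiring it into the cluster; 192.168.9.81 stands in for your NFS server's IP, as in the values.yaml below:

$ showmount -e 192.168.9.81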
Integrate it with k8s:
helm official GitHub: https://github.com/helm/helm
nfs-subdir-external-provisioner official repo: https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
$ helm repo add nfs-subdir-external-provisioner https://your-nexus/repository/helm-nfs/;
$ kubectl create ns nfs-system;
$ helm pull nfs-subdir-external-provisioner/nfs-subdir-external-provisioner;
$ tar xvf nfs-subdir-external-provisioner-4.0.18.tgz;
$ vim nfs-subdir-external-provisioner/values.yaml

image:
  repository: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner
  tag: v4.0.2
nfs:
  server: 192.168.9.81
  path: /data/k8s
storageClass:
  defaultClass: true
  name: nfs-sc

$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -f nfs-subdir-external-provisioner/values.yaml -n nfs-system
$ vim zadig-minio-pv.yml

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: zadig-minio-pv
spec:
  storageClassName: nfs-sc
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 40Gi

$ kubectl apply -f zadig-minio-pv.yml -n nfs-system
Check the result:
$ kubectl get storageclass
$ kubectl get sc nfs-sc -o wide;
$ kubectl get deployment -n nfs-system -o wide;
$ kubectl get pod -n nfs-system -o wide;
$ kubectl get pvc -n nfs-system -o wide;
6. CoreDNS
If you use a private DNS, CoreDNS will not forward queries to it correctly out of the box; edit its ConfigMap:
$ kubectl -n kube-system edit cm coredns
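As a sketch of what to add, assuming a private zone test.com served by a DNS at 10.0.0.2 (both hypothetical), append a server block to the Corefile inside that ConfigMap:

test.com:53 {
    errors
    cache 30
    forward . 10.0.0.2
}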
7. Deployment
7.1. Images
$ kubectl create namespace java-qa
$ kubectl create secret docker-registry my-pull-secret -n java-qa --docker-server=docker.test.com --docker-username=my_account --docker-password=my_password
$ kubectl get pods -A
$ kubectl get pod -n java-qa -o wide
$ kubectl get deploy -n java-qa -o wide
$ kubectl describe pod coredns-59c77d78dd-fh47c -n java-qa
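The pull secret only takes effect when a workload references it; a hypothetical Pod spec fragment (the image name is illustrative, the registry is the docker.test.com private registry assumed throughout):

spec:
  imagePullSecrets:
    - name: my-pull-secret
  containers:
    - name: demo
      image: docker.test.com/java-qa/demo:latest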