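Apart from the web UI, all basic viewing, troubleshooting, and operational tasks can be done with the kubectl command. Run kubectl --help for an overview, or ask a subcommand for help, for example kubectl create --help. The commands become familiar with regular use and frequent reading of the help output.
This post records some commonly used commands for viewing and operating on resources. Some of the descriptions are my own translations of the help text, but reading the original English help is still recommended.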
Basic Commands (Beginner):
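create Create a resource from a file or stdin
expose Create a Service for a deployment or pod.
run Run a particular image on the cluster. Used in older versions and slated for deprecation; the official recommendation is to use create instead. It takes many parameters; see kubectl run --help.
set Update a resource, for example env (environment variables), image, resources (resource limits), selector, subject, and so on.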
Basic Commands (Intermediate):
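get The most basic object query command, e.g. kubectl get nodes/pods/deploy/rs/ns/secret. Add -o wide for more detail, or -o yaml / -o json for a specific output format.
explain Show the documentation of a resource definition, e.g. kubectl explain replicaset
edit Edit a resource in the system editor to update the object, e.g. kubectl edit deploy/foo
delete Delete the specified resources by file name, resource name, or label selector.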
Deploy Commands:
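rollout Manage the rollout of a Deployment or DaemonSet (status, history, pause, resume, rollback, and so on)
scale Change the replica count of a Deployment, ReplicaSet, ReplicationController, or Job, i.e. scale a replica set manually.
autoscale Configure autoscaling rules for a Deploy, RS, or RC (depends on heapster and HPA in the version used here; newer clusters use metrics-server in place of heapster)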
Cluster Management Commands:
certificate Modify certificate resources.
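cluster-info Display cluster information
top Show resource usage (depends on heapster in the version used here; newer clusters use metrics-server)
cordon Mark a node as unschedulable
uncordon Mark a node as schedulable
drain Evict the workloads from a node in preparation for taking it offline for maintenance
taint Update the taints on a node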
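Troubleshooting and Debugging Commands:
describe Show the details of a resource
logs Show the logs of a container in a pod
attach Attach to a container in a pod
exec Execute a command in the specified container
port-forward Forward a local port to a pod
proxy Run a proxy to the Kubernetes API server
cp Copy files into and out of containers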
auth Inspect authorization
Advanced Commands:
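apply Create or update a resource from a file or stdin
patch Update some fields of an object using strategic merge patch syntax
replace Replace a resource from a file or stdin
convert Convert object definitions between API versions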
Settings Commands:
label Set labels on a resource
annotate Set annotations on a resource
completion Output the shell completion script (bash and zsh are supported)
Other Commands:
api-versions Print the supported API versions on the server, in the form of “group/version”
api-resources Print the supported API resources on the server
config Modify the kubectl configuration (the kubeconfig file), e.g. contexts
help Help about any command
version Show the Kubernetes version of the client and the server
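I. Practical kubectl tips (found online)
1. Look up resource short names
kubectl describe or kubectl api-resources (the latter lists them in its SHORTNAMES column)
2. Enable kubectl shell completion
source <(kubectl completion bash)
3. Tired of writing YAML by hand, or of hunting for samples?
Generate it with the run command: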
kubectl run --image=nginx my-deploy -o yaml --dry-run > my-deploy.yaml
4. Or export it from an existing object with the get command:
kubectl get statefulset/foo -o=yaml --export > new.yaml
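On newer kubectl releases the two commands above need adjusting: as far as I know, --dry-run now expects a value such as client, kubectl run creates a bare Pod rather than a Deployment, and the --export flag has been removed. A minimal sketch of an equivalent, assuming kubectl 1.18 or later; the function name gen_deploy_yaml is made up for illustration and is not part of kubectl:

```shell
# Hypothetical helper (not part of kubectl): print a Deployment manifest
# without creating anything on the cluster.
# Assumption: kubectl >= 1.18, where --dry-run takes the value "client".
gen_deploy_yaml() {
    # $1 = deployment name, $2 = container image
    kubectl create deployment "$1" --image="$2" --dry-run=client -o yaml
}
```

Usage would look like gen_deploy_yaml my-deploy nginx > my-deploy.yaml. For an existing object, kubectl get statefulset/foo -o yaml still works without --export, but the output then includes status and server-populated metadata that has to be trimmed by hand.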
II. Common inspection and troubleshooting commands
1. Check whether the cluster is healthy
[root@master01 ~]# kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-1               Healthy   {"health": "true"}
etcd-0               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
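If a component is unhealthy, check its logs. For example, if the etcd cluster intermittently reports unhealthy, look at the logs on the etcd nodes; an overloaded system can cause the heartbeat checks to fail.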
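kubectl get cs relies on the ComponentStatus API, which newer Kubernetes releases (1.19 and later, as far as I know) mark as deprecated. A hedged alternative is to ask the API server for its own readiness report, which typically includes an etcd check. The function name api_ready is made up for illustration:

```shell
# Hypothetical helper: print the API server's per-check readiness report.
# Assumptions: the cluster serves the /readyz endpoint and the current
# kubeconfig user is allowed to read it.
api_ready() {
    kubectl get --raw='/readyz?verbose'
}
```

Each line of the report names one check together with its result, so a failing check points to the component whose logs to read next.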
2. Check whether the master is healthy
[root@master01 ~]# kubectl cluster-info
Kubernetes master is running at https://10.1.14.21:6443
Heapster is running at https://10.1.14.25:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://10.1.14.25:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://10.1.14.21:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://192.168.20.134:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb/proxy
3. Create a group of nginx pods with kubectl run. This command creates a deploy and an rs. nginx 1.10 is used as the example here.
kubectl run nginx --image=nginx:1.10 --port=80 --labels="app=nginx1.10" --replicas=2
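Check the result:
[root@master01 yaml]# kubectl get pods
NAME                     READY   STATUS              RESTARTS   AGE
nginx-64f9d8b667-grtql   0/1     ContainerCreating   0          50s
nginx-64f9d8b667-vbhkh   0/1     ContainerCreating   0          49s
This is usually combined with the describe command to follow pod creation: it shows which node each pod was assigned to and how far creation has progressed, and if creation fails it records the reason.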
4. kubectl get
View resources, for example the commonly used nodes/pods/replicasets/services/endpoints/deployments/namespaces.
Use kubectl get xxx; append -o wide for more detailed output, or -o yaml / -o json for a specific output format.
[root@master01 ~]# kubectl get pods -n kube-system -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE
kube-dns-79fbb66f55-5xvlq   3/3     Running   3          21h   172.50.70.5   10.1.14.25   <none>
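When -o wide prints more columns than needed, -o custom-columns can select just the fields of interest. A small sketch; the function name pods_by_node and the choice of columns are only illustrative:

```shell
# Hypothetical helper: list the pods of a namespace showing only
# pod name, node name and phase.
pods_by_node() {
    # $1 = namespace (falls back to "default" when omitted)
    kubectl get pods -n "${1:-default}" \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,STATUS:.status.phase
}
```

For example, pods_by_node kube-system gives a compact view of where the system pods are running.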
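5. kubectl describe xxx
Supports most resources, such as node/pod/ns/deployment/rs/rc/svc.
Shows detailed information about a resource. It is typically used to inspect a resource's parameters and to troubleshoot pods that fail to start by reading the reported errors, such as image pull failures or invalid arguments.
Here is the output for one of the pods created above: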
[root@master01 yaml]# kubectl describe pods nginx-64f9d8b667-grtql
Name:               nginx-64f9d8b667-grtql
Namespace:          default
Node:               10.1.14.26/10.1.14.26
Start Time:         Wed, 28 Nov 2017 23:20:18 -0500
Labels:             app=nginx1.10
                    pod-template-hash=64f9d8b667
Annotations:        <none>
Status:             Pending
IP:
Controlled By:      ReplicaSet/nginx-64f9d8b667
Containers:
  nginx:
    Container ID:
    Image:          nginx:1.10
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bzjbz (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-bzjbz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bzjbz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  17s   default-scheduler    Successfully assigned default/nginx-64f9d8b667-grtql to 10.1.14.26
  Normal  Pulling    11s   kubelet, 10.1.14.26  pulling image "nginx:1.10"
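The pod nginx-64f9d8b667-grtql was assigned by kube-scheduler to node 10.1.14.26 and is currently pulling the nginx:1.10 image. Once the pull completes, further events report that the container was created and started successfully.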
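The events shown at the end of the describe output can also be fetched on their own, which is handy when repeatedly checking a pod that is stuck. A minimal sketch, assuming the pod lives in the current namespace; pod_events is a hypothetical name:

```shell
# Hypothetical helper: show only the events that refer to one pod,
# oldest first.
# Assumption: the pod is in the namespace of the current context.
pod_events() {
    # $1 = pod name
    kubectl get events \
        --field-selector "involvedObject.name=$1" \
        --sort-by=.lastTimestamp
}
```

Calling pod_events nginx-64f9d8b667-grtql would list the Scheduled and Pulling events for that pod without the rest of the describe output.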
6. Run commands in a pod or container with kubectl exec
Run a command such as date in a Pod; by default it runs in the Pod's first container:
kubectl exec <pod-name> -- <command>
Run the command in a specific container of the Pod:
kubectl exec <pod-name> -c <container-name> -- <command>
For example:
[root@master01 yaml]# kubectl exec nginx-64f9d8b667-grtql cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.50.75.2 nginx-64f9d8b667-grtql
Get a TTY in a container of the Pod through bash, which is equivalent to logging in to the container:
kubectl exec -it <pod-name> -- bash
Example:
[root@master01 yaml]# kubectl exec -ti nginx-64f9d8b667-grtql bash
root@nginx-64f9d8b667-grtql:/# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
172.50.75.2 nginx-64f9d8b667-grtql
7. kubectl logs
kubectl logs retrieves the logs of the containers in a pod, which is important information when troubleshooting.
[root@master01 yaml]# kubectl logs nginx-dbddb74b8-7pwk2
10.254.143.54 - - [28/Nov/2017:03:19:43 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0" "-"
172.50.75.1 - - [28/Nov/2017:03:23:12 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36" "-"
2017/11/28 03:23:13 [error] 7#7: *3 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 172.50.75.1, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "10.1.14.26:30765", referrer: "http://10.1.14.26:30765/"
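Two options that tend to be useful while troubleshooting are --tail, which limits the output to the most recent lines, and --previous, which reads the log of the previous instance of a container that has restarted. A sketch with made-up wrapper names:

```shell
# Hypothetical helpers around kubectl logs.

# Print only the last N log lines of a pod.
recent_logs() {
    # $1 = pod name, $2 = number of lines (defaults to 50)
    kubectl logs "$1" --tail="${2:-50}"
}

# Print the log of the previous container instance, e.g. after a crash.
crashed_logs() {
    # $1 = pod name
    kubectl logs "$1" --previous
}
```

For a pod with several containers, the container still has to be selected with -c <container-name>, which these wrappers leave out for brevity.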
8. kubectl expose creates a svc that exposes a port so the application can be reached from outside.
kubectl expose deploy nginx --port=80 --target-port=80 --type=NodePort --name=nginx-service
View the svc:
[root@master01 yaml]# kubectl get svc -o wide
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE    SELECTOR
kubernetes      ClusterIP   10.254.0.1       <none>        443/TCP        16d    <none>
nginx-service   NodePort    10.254.218.227   <none>        80:30680/TCP   119s   app=nginx1.10
[root@master01 yaml]# kubectl describe svc nginx-service
Name:                     nginx-service
Namespace:                default
Labels:                   app=nginx1.10
Annotations:              <none>
Selector:                 app=nginx1.10
Type:                     NodePort
IP:                       10.254.218.227
Port:                     80/TCP
TargetPort:               80/TCP
NodePort:                 30680/TCP
Endpoints:                172.50.70.4:80,172.50.75.2:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
Test access, first through the cluster IP and then through the NodePort:
[root@node01 ~]# curl -I 10.254.218.227:80
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Thu, 29 Nov 2017 04:40:48 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 31 Jan 2017 15:01:11 GMT
Connection: keep-alive
ETag: "5890a6b7-264"
Accept-Ranges: bytes
[root@node01 ~]# curl -I 10.1.14.25:30680
HTTP/1.1 200 OK
Server: nginx/1.10.3
Date: Thu, 29 Nov 2017 04:41:30 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 31 Jan 2017 15:01:11 GMT
Connection: keep-alive
ETag: "5890a6b7-264"
Accept-Ranges: bytes
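9. Use kubectl set or kubectl edit to update the pod template: anything from an image update down to env settings in the pod template, resource requests and limits, selector settings, or sa and subject changes.
For example, update the image from nginx:1.10 to nginx:1.11, once with kubectl set and once with kubectl edit.
(1) kubectl set; see kubectl set --help for detailed usage: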
kubectl set image deploy/nginx nginx=nginx:1.11
Access the service again and the nginx response header now reports version 1.11.
(2) kubectl edit to edit the deploy
kubectl edit deploy/nginx
In the editor, change the image version directly, then save and exit. This also updates the image of the pods in the deploy.
10. kubectl rollout: view rollout status and revision history, and roll back
(1) kubectl rollout status shows the progress of a rollout:
[root@master01 ~]# kubectl rollout status deploy/nginx
Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
Waiting for deployment "nginx" rollout to finish: 1 old replicas are pending termination...
Waiting for deployment "nginx" rollout to finish: 1 old replicas are pending termination...
deployment "nginx" successfully rolled out
(2) kubectl rollout history shows the rollout history; add --revision=x (x is the revision number) to see the details of one revision
[root@master01 ~]# kubectl rollout history deploy/nginx
deployment.extensions/nginx
REVISION  CHANGE-CAUSE
2         <none>
3         <none>
4         kubectl set image deploy/nginx nginx=nginx:1.13 --record=true
[root@master01 ~]# kubectl rollout history deploy/nginx --revision=2
deployment.extensions/nginx with revision #2
Pod Template:
  Labels:       app=nginx1.10
                pod-template-hash=d99665758
  Containers:
   nginx:
    Image:        nginx:1.11
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
(3) kubectl rollout undo rolls back to the previous revision; --to-revision=x rolls back to a specific revision
kubectl rollout undo deployment nginx # roll back to the previous revision
kubectl rollout undo deployment nginx --to-revision=2 # roll back to a specific revision
(4) kubectl rollout pause/resume, which pause and resume a rollout, are not demonstrated here; they are used in much the same way.
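For completeness, a hedged sketch of how pause and resume are commonly combined: pause the Deployment, apply several changes, then resume so that they roll out together as a single revision. The function name batched_update, its arguments, and the resource limit values are hypothetical examples:

```shell
# Hypothetical helper: apply an image change and a resource-limit change
# to a Deployment as one rollout instead of two.
batched_update() {
    # $1 = deployment name, $2 = container name, $3 = new image
    kubectl rollout pause "deploy/$1" &&
    kubectl set image "deploy/$1" "$2=$3" &&
    # The limit values below are arbitrary example numbers.
    kubectl set resources "deploy/$1" -c "$2" --limits=cpu=200m,memory=256Mi &&
    kubectl rollout resume "deploy/$1" &&
    kubectl rollout status "deploy/$1"
}
```

While the Deployment is paused, changes to its pod template are recorded but no new pods are created; the rollout only starts at the resume step, and the final status call waits for it to finish.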
11. kubectl scale scales a deploy; use it when a deploy needs to be scaled by hand.
[root@master01 ~]# kubectl scale deploy nginx --replicas=4
deployment.extensions/nginx scaled
[root@master01 ~]# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE
nginx-d99665758-5pm47   0/1     Pending   0          1s      <none>        10.1.14.25   <none>
nginx-d99665758-qqhhx   1/1     Running   0          2m42s   172.50.51.3   10.1.14.26   <none>
nginx-d99665758-tppvb   0/1     Pending   0          2s      <none>        10.1.14.25   <none>
nginx-d99665758-z5dhz   1/1     Running   0          2m51s   172.50.51.2   10.1.14.26   <none>
Please credit the source when reposting: 21运维 » Common Kubernetes cluster inspection and troubleshooting commands (continuously updated)