在Kubernetes中,coredns作为一个服务发现的配置中心,K8S中创建的 Service 和 Pod 都会在其中自动生成相应的 DNS 记录,能够用作集群内部的dns解析,作为一个重要组件,coredns一般都采用daemonSet方式部署。
本文记录了几种场景下coredns异常问题。
1. 服务器重启后coredns异常
服务器重启后,K8S集群内coredns状态为CrashLoopBackOff,coredns版本:1.8.3,查看coredns日志如下
[root@reg ~]# kubectl logs -f coredns-bbckw -n kube-system
plugin/forward: no nameservers found
查看/etc/resolv.conf的dns配置,配置里一个 nameserver 都没有
[root@reg ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search localhost
会导致 coredns 无法起来刷日志,查看configmap 的 coredns 的配置里有如下配置
forward . /etc/resolv.conf {
max_concurrent 1000
}
coredns的/etc/resolv.conf是挂载的宿主机的,会读取文件里的
nameserver配置,所有coredns无法启动的原因找到了,临时解决方式就是重新配置/etc/resolv.conf配置
为了防止下次重启再次丢失,彻底解决方式如下
编辑该文件,在[main]下面加入dns=none
vi /etc/NetworkManager/NetworkManager.conf
[main]
dns=none
2. 服务器重启后coredns异常
场景:/etc/resolv.conf文件丢失
[root@02-3 ~]# kubectl get pods -owide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5mftn 1/1 Running 2 174d 172.27.0.214 10.10.28.10 <none> <none>
coredns-9bnl8 1/1 Running 2 62d 172.27.4.31 10.10.28.141 <none> <none>
coredns-9tmcj 0/1 ContainerCreating 0 105s <none> 10.10.28.116 <none> <none>
查看pod的日志
[root@02-3 ~]# kubectl describe pods coredns-9tmcj -n kube-system
........
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m11s default-scheduler Successfully assigned kube-system/coredns-9tmcj to 10.10.28.116
Warning FailedCreatePodSandBox 1s (x11 over 2m11s) kubelet Failed to create pod sandbox: open /etc/resolv.conf: no such file or directory
检查dns的配置文件
[root@02-3 ~]# ll /etc/resolv.conf
ls: 无法访问/etc/resolv.conf: 没有那个文件或目录
这个报错比较明显,解决方式同场景一
3. 服务器重启后coredns异常
coredns部署后状态在CrashLoopBackOff和Error不停转换
[root@02-3 ~]# kubectl get pods -owide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5mftn 1/1 Running 2 174d 172.27.0.214 10.10.28.10 <none> <none>
coredns-9bnl8 1/1 Running 2 62d 172.27.4.31 10.10.28.141 <none> <none>
coredns-n2knx 0/1 CrashLoopBackOff 2 37s 172.27.2.167 10.10.28.116 <none> <none>
查看coredns的pod日志
[root@02-3 ~]# kubectl describe pods coredns-n2knx -n kube-system
.....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m4s default-scheduler Successfully assigned kube-system/coredns-n2knx to 10.10.28.116
Warning BackOff 31s (x10 over 2m1s) kubelet Back-off restarting failed container
Normal Pulled 17s (x5 over 2m4s) kubelet Container image "reg.wps.lan:5000/wps/coredns:1.8.3" already present on machine
Normal Created 17s (x5 over 2m4s) kubelet Created container coredns
Normal Started 17s (x5 over 2m3s) kubelet Started container coredns
从日志上看到coredns无法启动,直接查看docker的日志
[root@02-3 ~]# docker ps -a | grep coredns
d8bfd8b06015 3885a5b7f138 "/coredns -conf /etc…" 26 seconds ago Exited (1) 25 seconds ago k8s_coredns_coredns-n2knx_kube-system_a78e5c88-6776-4a4e-ab48-d380b0f18c93_4
e7368c926cb0 reg.wps.lan:5000/wps/pause:3.3 "/pause" 2 minutes ago Up 2 minutes k8s_POD_coredns-n2knx_kube-system_a78e5c88-6776-4a4e-ab48-d380b0f18c93_0
[root@02-3 ~]#
[root@02-3 ~]# docker logs -f d8bfd8b06015
.:53
[INFO] plugin/reload: Running configuration MD5 = 4fe39d48e613d0d0369935d7aafb5fa6
CoreDNS-1.8.3
linux/amd64, go1.16, 4293992
[FATAL] plugin/loop: Loop (127.0.0.1:59055 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 184772892781744701.6546771560531965938."
检查dns配置,nameserver的dns配置成了127.0.0.1,修改成正确的dns地址即可
[root@02-3 ~]# cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search openstacklocal novalocal
nameserver 127.0.0.1
© 版权声明
本站文章由不念博客原创,未经允许严禁转载!
THE END