I created a 5-VM Kubernetes cluster (1 master and 4 slaves, running Ubuntu 16.04.3 LTS) using kubeadm, and set up networking in the cluster with flannel. I was able to deploy an application successfully, and then exposed it via a NodePort service. From here things got complicated: the NodePort service is unreachable by IP on 2 of the 4 slaves in the cluster.

Before I started, I disabled the default firewalld service on the master and on all nodes.

As I understand it from the K8s Services doc, a service of type NodePort is exposed on every node in the cluster. But when I created it, the service was only exposed on 2 of the 4 nodes. I'm guessing this is not the expected behavior (is it?).
For troubleshooting, here are some resource specs:

root@master:~# kubectl get nodes
NAME              STATUS    AGE       VERSION
vm-deepejai-00b   Ready     5m        v1.7.3
vm-plashkar-006   Ready     4d        v1.7.3
vm-rosnthom-00f   Ready     4d        v1.7.3
vm-vivekse-003    Ready     4d        v1.7.3    //the master
vm-vivekse-004    Ready     16h       v1.7.3
root@master:~# kubectl get pods -o wide -n playground
NAME                                     READY     STATUS    RESTARTS   AGE       IP           NODE
kubernetes-bootcamp-2457653786-9qk80     1/1       Running   0          2d        10.244.3.6   vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc   1/1       Running   0          1d        10.244.3.7   vm-rosnthom-00f

root@master:~# kubectl get svc -o wide -n playground
NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE       SELECTOR
sb-hw-svc   10.101.180.19   <nodes>       9000:30847/TCP   5h        run=springboot-helloworld
root@master:~# kubectl describe svc sb-hw-svc -n playground
Name:              sb-hw-svc
Namespace:         playground
Labels:            <none>
Annotations:       <none>
Selector:          run=springboot-helloworld
Type:              NodePort
IP:                10.101.180.19
Port:              <unset>  9000/TCP
NodePort:          <unset>  30847/TCP
Endpoints:         10.244.3.7:9000
Session Affinity:  None
Events:            <none>
root@master:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-08-09T06:28:06Z
  name: sb-hw-svc
  namespace: playground
  resourceVersion: "588958"
  selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
  uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
  - ip: 10.244.3.7
    nodeName: vm-rosnthom-00f
    targetRef:
      kind: Pod
      name: springboot-helloworld-2842952983-rw0gc
      namespace: playground
      resourceVersion: "473859"
      uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
  ports:
  - port: 9000
    protocol: TCP
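For completeness, the 'kubectl describe' output above corresponds to a service manifest roughly like the following. This is a reconstruction from the describe output, not the actual manifest or command used to create the service:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sb-hw-svc
  namespace: playground
spec:
  type: NodePort
  selector:
    run: springboot-helloworld
  ports:
  - port: 9000          # port on the cluster IP
    nodePort: 30847     # port that should open on every node
    protocol: TCP
```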
After some tinkering, I realized that on those 2 "faulty" nodes, the service was not reachable even from inside the hosts themselves.
NODE01 (working):
root@node01:~# curl 127.0.0.1:30847        //<localhost>:<nodeport>
Hello Docker World!!
root@node01:~# curl 10.101.180.19:9000     //<cluster-ip>:<port>
Hello Docker World!!
root@node01:~# curl 10.244.3.7:9000        //<pod-ip>:<port>
Hello Docker World!!

NODE02 (working):
root@node02:~# curl 127.0.0.1:30847
Hello Docker World!!
root@node02:~# curl 10.101.180.19:9000
Hello Docker World!!
root@node02:~# curl 10.244.3.7:9000
Hello Docker World!!

NODE03 (not working):
root@node03:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@node03:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@node03:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

NODE04 (not working):
root@node04:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@node04:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@node04:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out
I tried netstat and telnet on all 4 slaves. Here is the output:
NODE01 (working host):
root@node01:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847        :::*            LISTEN      27808/kube-proxy
root@node01:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

NODE02 (working host):
root@node02:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847        :::*            LISTEN      11842/kube-proxy
root@node02:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

NODE03 (non-working host):
root@node03:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847        :::*            LISTEN      7791/kube-proxy
root@node03:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

NODE04 (non-working host):
root@node04:/# netstat -tulpn | grep 30847
tcp6       0      0 :::30847        :::*            LISTEN      689/kube-proxy
root@node04:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out
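To make the "which nodes actually serve the NodePort" check repeatable, a small probe loop can be run from any machine that can reach the nodes. This is a sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available; the hostnames are the ones from 'kubectl get nodes' above:

```shell
# probe HOST PORT: report whether a TCP connection can be opened within 2s
probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}

# Check NodePort 30847 on every slave (hostnames from `kubectl get nodes`)
for host in vm-deepejai-00b vm-plashkar-006 vm-rosnthom-00f vm-vivekse-004; do
  probe "$host" 30847
done
```

On this cluster the loop should print "open" for the two working nodes and "closed" for the two broken ones, which makes it quick to re-test after each fix attempt.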
Additional info:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I am able to ping that host from all 5 VMs, and curl vm-rosnthom-00f:30847 works from all the VMs as well.
I can clearly see that the internal cluster networking is messed up, but I'm unsure how to fix it. The iptables -L output is identical on all the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm at a complete loss as to how to solve this!
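Since the filter table ('iptables -L') looks identical everywhere, the nat table is the more interesting one here: in its default iptables mode, kube-proxy implements a NodePort service as DNAT rules in the nat table, so 'iptables-save -t nat | grep 30847' (as root, on each node) should show a KUBE-NODEPORTS entry on every node. Below is a minimal sketch of what a healthy node's rules look like; the KUBE-SVC-/KUBE-SEP- chain suffixes are invented for illustration (real ones are per-cluster hashes), while the DNAT destination is this service's actual endpoint:

```shell
# Illustrative sample of the NAT rules kube-proxy programs for NodePort 30847;
# chain-name suffixes are made up, the DNAT target is this service's endpoint.
sample_rules='-A KUBE-NODEPORTS -p tcp -m tcp --dport 30847 -j KUBE-SVC-EXAMPLE
-A KUBE-SVC-EXAMPLE -j KUBE-SEP-EXAMPLE
-A KUBE-SEP-EXAMPLE -p tcp -m tcp -j DNAT --to-destination 10.244.3.7:9000'

# The same grep used on real `iptables-save -t nat` output: zero matches on a
# broken node would mean kube-proxy never programmed the rules there.
echo "$sample_rules" | grep -c -- '--dport 30847'
```

If the rules do exist on the broken nodes too, the DNAT side is fine and the problem more likely sits in the flannel overlay between kube-proxy and the pod network.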
Comments:

Just to confirm: do all of the non-docker interface IP addresses have a separate IP address space from docker, the pods, and the services? The command I'd like to see is 'root@vm-deepejai-00b:/# curl THE_IP_OF_vm-vivekse-004:30847', to make sure vm-deepejai-00b can even route traffic to vm-vivekse-004, since that's what is happening under the hood anyway. – Matthew L Daniel

Also, for clarity, have you checked 'iptables -t nat -L' as well as 'iptables -L'? (I can't tell which of the two you meant.) – Matthew L Daniel

@MatthewLDaniel Regarding your first comment, the curl works: 'root@vm-deepejai-00b:~# curl 173.36.23.4:30847' returns 'Hello Docker World!!', where 173.36.23.4 is the IP of vm-vivekse-004.