2017-08-09 80 views
10

I created a Kubernetes cluster of 5 VMs (1 master and 4 slaves, running Ubuntu 16.04.3 LTS) using kubeadm, with flannel providing the cluster network. I was able to deploy an application successfully and then exposed it via a NodePort service. From here, things got complicated: the K8s NodePort service is "unreachable by IP" on 2 of the 4 slaves in the cluster.

Before I started, I disabled the default firewalld service on the master and the nodes.
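For reference, disabling the firewall as described would look something like this (a sketch, assuming a systemd-managed firewalld as the question states; kube-proxy needs its own iptables rules to stay untouched):

```shell
# Stop firewalld now and prevent it from starting on boot,
# so it cannot overwrite kube-proxy's iptables rules
sudo systemctl stop firewalld
sudo systemctl disable firewalld
```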

From what I understand, the K8s Services doc says that a service of type NodePort is exposed on all nodes in the cluster. However, when I created one, the service was only exposed on 2 of the 4 nodes in the cluster. I'm guessing this is not the expected behaviour (is it?)
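The exact creation command isn't shown above, but given the selector `run=springboot-helloworld` and the specs below, the service was presumably created along these lines (a sketch; the deployment name and ports are taken from the outputs further down):

```shell
# Expose the existing deployment on port 9000 as a NodePort service;
# Kubernetes allocates a node port (here 30847) from 30000-32767 on every node
kubectl expose deployment springboot-helloworld \
  --name=sb-hw-svc --port=9000 --type=NodePort -n playground
```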

For troubleshooting, here are some resource specs:

[email protected]:~# kubectl get nodes 
NAME    STATUS AGE  VERSION 
vm-deepejai-00b Ready  5m  v1.7.3 
vm-plashkar-006 Ready  4d  v1.7.3 
vm-rosnthom-00f Ready  4d  v1.7.3 
vm-vivekse-003 Ready  4d  v1.7.3 //the master 
vm-vivekse-004 Ready  16h  v1.7.3 

[email protected]:~# kubectl get pods -o wide -n playground 
NAME          READY  STATUS RESTARTS AGE  IP   NODE 
kubernetes-bootcamp-2457653786-9qk80  1/1  Running 0   2d  10.244.3.6 vm-rosnthom-00f 
springboot-helloworld-2842952983-rw0gc 1/1  Running 0   1d  10.244.3.7 vm-rosnthom-00f 

[email protected]:~# kubectl get svc -o wide -n playground 
NAME  CLUSTER-IP  EXTERNAL-IP PORT(S)   AGE  SELECTOR 
sb-hw-svc 10.101.180.19 <nodes>  9000:30847/TCP 5h  run=springboot-helloworld 

[email protected]:~# kubectl describe svc sb-hw-svc -n playground 
Name:    sb-hw-svc 
Namespace:   playground 
Labels:    <none> 
Annotations:  <none> 
Selector:   run=springboot-helloworld 
Type:    NodePort 
IP:     10.101.180.19 
Port:    <unset> 9000/TCP 
NodePort:   <unset> 30847/TCP 
Endpoints:   10.244.3.7:9000 
Session Affinity: None 
Events:    <none> 

[email protected]:~# kubectl get endpoints sb-hw-svc -n playground -o yaml 
apiVersion: v1 
kind: Endpoints 
metadata: 
    creationTimestamp: 2017-08-09T06:28:06Z 
    name: sb-hw-svc 
    namespace: playground 
    resourceVersion: "588958" 
    selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc 
    uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b 
subsets: 
- addresses: 
    - ip: 10.244.3.7 
    nodeName: vm-rosnthom-00f 
    targetRef: 
     kind: Pod 
     name: springboot-helloworld-2842952983-rw0gc 
     namespace: playground 
     resourceVersion: "473859" 
     uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b 
    ports: 
    - port: 9000 
    protocol: TCP 
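
Since kube-proxy is what programs the NodePort rules, a first check on a failing node would be whether its nat table actually contains the service (a sketch; `KUBE-NODEPORTS` and the `KUBE-SVC-*` chains are what kube-proxy creates in iptables mode):

```shell
# On a failing node: look for the NodePort rule for 30847 and for any
# nat-table rules referencing the service's cluster IP
sudo iptables -t nat -L KUBE-NODEPORTS -n | grep 30847
sudo iptables -t nat -L -n | grep 10.101.180.19
```

If these rules are present on the failing nodes too, the problem is more likely in packet forwarding than in kube-proxy itself.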

After some tinkering, I realised that on those 2 "faulty" nodes, the services were not reachable even from within those hosts themselves.

NODE01 (working):

[email protected]:~# curl 127.0.0.1:30847  //<localhost>:<nodeport> 
Hello Docker World!! 
[email protected]:~# curl 10.101.180.19:9000 //<cluster-ip>:<port> 
Hello Docker World!! 
[email protected]:~# curl 10.244.3.7:9000  //<pod-ip>:<port> 
Hello Docker World!! 

NODE02 (working):

[email protected]:~# curl 127.0.0.1:30847 
Hello Docker World!! 
[email protected]:~# curl 10.101.180.19:9000 
Hello Docker World!! 
[email protected]:~# curl 10.244.3.7:9000 
Hello Docker World!! 

NODE03 (not working):

[email protected]:~# curl 127.0.0.1:30847 
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out 
[email protected]:~# curl 10.101.180.19:9000 
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out 
[email protected]:~# curl 10.244.3.7:9000 
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out 

NODE04 (not working):

[email protected]:/# curl 127.0.0.1:30847 
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out 
[email protected]:/# curl 10.101.180.19:9000 
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out 
[email protected]:/# curl 10.244.3.7:9000 
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out 

I tried netstat and telnet on all 4 slaves. Here is the output:

NODE01 (working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  27808/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
Connected to 127.0.0.1. 
Escape character is '^]'. 

NODE02 (working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  11842/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
Connected to 127.0.0.1. 
Escape character is '^]'. 

NODE03 (non-working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  7791/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
telnet: Unable to connect to remote host: Connection timed out 

NODE04 (non-working host):

[email protected]:/# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  689/kube-proxy 
[email protected]:/# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
telnet: Unable to connect to remote host: Connection timed out 

Additional info:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I'm able to ping that host from all 5 VMs, and curl vm-rosnthom-00f:30847 works from all the VMs as well.

I can clearly see that the internal cluster networking is messed up, but I'm unsure how to fix it! iptables -L is identical on all the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm completely clueless about how to resolve this!
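One low-level setting worth comparing across the slaves is kernel IP forwarding, which flannel and kube-proxy both rely on to route pod and service traffic between interfaces (a minimal check, assuming a Linux host):

```shell
# Check whether the kernel will forward packets between interfaces;
# NodePort traffic destined for a pod on another node needs this
state=$(cat /proc/sys/net/ipv4/ip_forward)
if [ "$state" -eq 1 ]; then
    echo "ip_forward: enabled"
else
    echo "ip_forward: DISABLED - forwarded NodePort/pod traffic will be dropped"
fi
```

If the value differs between the working and non-working slaves, that alone would explain the asymmetry.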

+0

Just to confirm: do all the non-docker interfaces have IP addresses in a separate address space from the docker, pod, and service networks? The command I'd like to see is 'root@vm-deepejai-00b:/# curl THE_IP_OF_vm-vivekse-004:30847', to make sure 'vm-deepejai-00b' can route traffic to 'vm-vivekse-004', because that's what is happening underneath anyway –

+0

Also, for clarity's sake, have you checked 'iptables -t nat -L' in addition to 'iptables -L'? (I couldn't tell which one you meant) –

+0

@MatthewLDaniel Regarding your first comment, the curl works: 'root@vm-deepejai-00b:~# curl 173.36.23.4:30847 Hello Docker World!!' where 173.36.23.4 is the IP of vm-vivekse-004 –

Answers

-3

If you want to reach the service from any node in the cluster, you need a service of type ClusterIP. Since you defined the service type as NodePort, you can only connect from the node on which the service is running.


My answer above is incorrect. Based on the documentation, we should be able to connect from any NodeIP:NodePort. But it was not working in my cluster either.

https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types

NodePort: Exposes the service on each Node's IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You'll be able to contact the NodePort service, from outside the cluster, by requesting NodeIP:NodePort.

IP forwarding was not set on one of my nodes. After setting it, I was able to connect to my service using NodeIP:NodePort:

sysctl -w net.ipv4.ip_forward=1 
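Note that sysctl -w only changes the running kernel; to make the fix survive a reboot it can also be persisted (a standard sysctl config fragment; the file name here is an arbitrary choice):

```shell
# Persist IP forwarding across reboots and reload all sysctl config
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf
sudo sysctl --system
```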