2017-08-09 80 views
10

I created a Kubernetes cluster of 5 VMs (1 master and 4 slaves, running Ubuntu 16.04.3 LTS) using kubeadm, with flannel providing the cluster network. I was able to deploy an application successfully and then exposed it via a NodePort service. From here, things got complicated: the K8s NodePort service is "unreachable by IP" on 2 of the 4 slaves in the cluster.

Before I started, I disabled the default firewalld service on the master and the nodes.
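For reference, disabling the firewall as described would look something like this (a sketch, assuming a systemd-managed firewalld as the question states; kube-proxy needs its own iptables rules to stay untouched):

```shell
# Stop firewalld now and prevent it from starting on boot,
# so it cannot overwrite kube-proxy's iptables rules
sudo systemctl stop firewalld
sudo systemctl disable firewalld
```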

From what I understand, the K8s Services doc says that a service of type NodePort is exposed on all nodes in the cluster. However, when I created one, the service was only exposed on 2 of the 4 nodes in the cluster. I'm guessing this is not the expected behaviour (is it?)
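The exact creation command isn't shown above, but given the selector `run=springboot-helloworld` and the specs below, the service was presumably created along these lines (a sketch; the deployment name and ports are taken from the outputs further down):

```shell
# Expose the existing deployment on port 9000 as a NodePort service;
# Kubernetes allocates a node port (here 30847) from 30000-32767 on every node
kubectl expose deployment springboot-helloworld \
  --name=sb-hw-svc --port=9000 --type=NodePort -n playground
```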

For troubleshooting, here are some resource specs:

[email protected]:~# kubectl get nodes 
NAME    STATUS AGE  VERSION 
vm-deepejai-00b Ready  5m  v1.7.3 
vm-plashkar-006 Ready  4d  v1.7.3 
vm-rosnthom-00f Ready  4d  v1.7.3 
vm-vivekse-003 Ready  4d  v1.7.3 //the master 
vm-vivekse-004 Ready  16h  v1.7.3 

[email protected]:~# kubectl get pods -o wide -n playground 
NAME          READY  STATUS RESTARTS AGE  IP   NODE 
kubernetes-bootcamp-2457653786-9qk80  1/1  Running 0   2d  10.244.3.6 vm-rosnthom-00f 
springboot-helloworld-2842952983-rw0gc 1/1  Running 0   1d  10.244.3.7 vm-rosnthom-00f 

[email protected]:~# kubectl get svc -o wide -n playground 
NAME  CLUSTER-IP  EXTERNAL-IP PORT(S)   AGE  SELECTOR 
sb-hw-svc 10.101.180.19 <nodes>  9000:30847/TCP 5h  run=springboot-helloworld 

[email protected]:~# kubectl describe svc sb-hw-svc -n playground 
Name:    sb-hw-svc 
Namespace:   playground 
Labels:    <none> 
Annotations:  <none> 
Selector:   run=springboot-helloworld 
Type:    NodePort 
IP:     10.101.180.19 
Port:    <unset> 9000/TCP 
NodePort:   <unset> 30847/TCP 
Endpoints:   10.244.3.7:9000 
Session Affinity: None 
Events:    <none> 

[email protected]:~# kubectl get endpoints sb-hw-svc -n playground -o yaml 
apiVersion: v1 
kind: Endpoints 
metadata: 
    creationTimestamp: 2017-08-09T06:28:06Z 
    name: sb-hw-svc 
    namespace: playground 
    resourceVersion: "588958" 
    selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc 
    uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b 
subsets: 
- addresses: 
    - ip: 10.244.3.7 
    nodeName: vm-rosnthom-00f 
    targetRef: 
     kind: Pod 
     name: springboot-helloworld-2842952983-rw0gc 
     namespace: playground 
     resourceVersion: "473859" 
     uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b 
    ports: 
    - port: 9000 
    protocol: TCP 
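
Since kube-proxy is what programs the NodePort rules, a first check on a failing node would be whether its nat table actually contains the service (a sketch; `KUBE-NODEPORTS` and the `KUBE-SVC-*` chains are what kube-proxy creates in iptables mode):

```shell
# On a failing node: look for the NodePort rule for 30847 and for any
# nat-table rules referencing the service's cluster IP
sudo iptables -t nat -L KUBE-NODEPORTS -n | grep 30847
sudo iptables -t nat -L -n | grep 10.101.180.19
```

If these rules are present on the failing nodes too, the problem is more likely in packet forwarding than in kube-proxy itself.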

After some tinkering, I realised that on those 2 "faulty" nodes, the services were not reachable even from within those hosts themselves.

NODE01 (working):

[email protected]:~# curl 127.0.0.1:30847  //<localhost>:<nodeport> 
Hello Docker World!! 
[email protected]:~# curl 10.101.180.19:9000 //<cluster-ip>:<port> 
Hello Docker World!! 
[email protected]:~# curl 10.244.3.7:9000  //<pod-ip>:<port> 
Hello Docker World!! 

NODE02 (working):

[email protected]:~# curl 127.0.0.1:30847 
Hello Docker World!! 
[email protected]:~# curl 10.101.180.19:9000 
Hello Docker World!! 
[email protected]:~# curl 10.244.3.7:9000 
Hello Docker World!! 

NODE03 (not working):

[email protected]:~# curl 127.0.0.1:30847 
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out 
[email protected]:~# curl 10.101.180.19:9000 
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out 
[email protected]:~# curl 10.244.3.7:9000 
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out 

NODE04 (not working):

[email protected]:/# curl 127.0.0.1:30847 
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out 
[email protected]:/# curl 10.101.180.19:9000 
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out 
[email protected]:/# curl 10.244.3.7:9000 
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out 

I tried netstat and telnet on all 4 slaves. Here is the output:

NODE01 (working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  27808/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
Connected to 127.0.0.1. 
Escape character is '^]'. 

NODE02 (working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  11842/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
Connected to 127.0.0.1. 
Escape character is '^]'. 

NODE03 (non-working host):

[email protected]:~# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  7791/kube-proxy 
[email protected]:~# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
telnet: Unable to connect to remote host: Connection timed out 

NODE04 (non-working host):

[email protected]:/# netstat -tulpn | grep 30847 
tcp6  0  0 :::30847    :::*     LISTEN  689/kube-proxy 
[email protected]:/# telnet 127.0.0.1 30847 
Trying 127.0.0.1... 
telnet: Unable to connect to remote host: Connection timed out 

Additional info:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I'm able to ping that host from all 5 VMs, and curl vm-rosnthom-00f:30847 works from all the VMs as well.

I can clearly see that the internal cluster networking is messed up, but I'm unsure how to fix it! iptables -L is identical on all the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm completely clueless about how to resolve this!
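One low-level setting worth comparing across the slaves is kernel IP forwarding, which flannel and kube-proxy both rely on to route pod and service traffic between interfaces (a minimal check, assuming a Linux host):

```shell
# Check whether the kernel will forward packets between interfaces;
# NodePort traffic destined for a pod on another node needs this
state=$(cat /proc/sys/net/ipv4/ip_forward)
if [ "$state" -eq 1 ]; then
    echo "ip_forward: enabled"
else
    echo "ip_forward: DISABLED - forwarded NodePort/pod traffic will be dropped"
fi
```

If the value differs between the working and non-working slaves, that alone would explain the asymmetry.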

+0

Just to confirm: do all the non-docker interfaces have IP addresses in a separate address space from the docker, pod, and service networks? The command I'd like to see is 'root@vm-deepejai-00b:/# curl THE_IP_OF_vm-vivekse-004:30847', to make sure 'vm-deepejai-00b' can route traffic to 'vm-vivekse-004', because that's what is happening underneath anyway –

+0

Also, for clarity's sake, have you checked 'iptables -t nat -L' in addition to 'iptables -L'? (I couldn't tell which one you meant) –

+0

@MatthewLDaniel Regarding your first comment, the curl works: 'root@vm-deepejai-00b:~# curl 173.36.23.4:30847 Hello Docker World!!' where 173.36.23.4 is the IP of vm-vivekse-004 –

Answers

-3

If you want to reach the service from any node in the cluster, you need a service of type ClusterIP. Since you defined the service type as NodePort, you can only connect from the node on which the service is running.


My answer above is incorrect. Based on the documentation, we should be able to connect from any NodeIP:NodePort. But it was not working in my cluster either.

https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types

NodePort: Exposes the service on each Node's IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You'll be able to contact the NodePort service, from outside the cluster, by requesting NodeIP:NodePort.

IP forwarding was not set on one of my nodes. After setting it, I was able to connect to my service using NodeIP:NodePort:

sysctl -w net.ipv4.ip_forward=1 
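Note that sysctl -w only changes the running kernel; to make the fix survive a reboot it can also be persisted (a standard sysctl config fragment; the file name here is an arbitrary choice):

```shell
# Persist IP forwarding across reboots and reload all sysctl config
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf
sudo sysctl --system
```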