I am using Accelio over softRoCE and get an error when starting opensm.
IB devices configured:
# ibv_devices
device node GUID
------ ----------------
rxe1 821f02fffef91598
rxe0 d6bed9fffebe94af
Error while running the Accelio client:
# xio_ow_client
=============================================
Server Address : 127.0.0.1
Server Port : 2061
Transport : rdma
Header Length : 32
Data Length : 32
Connection Index : 0
CPU Affinity : 0
Finite run : 0
=============================================
**** starting ...
session event: connection error. reason: No such device
# rping -c
rdma_resolve_route: No such device
So I checked the opensm status:
# /etc/init.d/opensmd status
opensm stopped
# /etc/init.d/opensmd start
opensm start [FAILED]
# tail -f /var/log/opensm.log
Jul 09 15:04:45 655213 [AA4F3700] 0x03 -> OpenSM 3.3.7
Jul 09 15:04:45 692960 [AA4F3700] 0x80 -> OpenSM 3.3.7
Jul 09 15:04:45 693149 [AA4F3700] 0x02 -> osm_vendor_init: 1000 pending umads specified
Jul 09 15:04:45 797977 [AA4F3700] 0x80 -> Entering DISCOVERING state
Jul 09 15:04:45 799152 [AA4F3700] 0x02 -> osm_vendor_bind: Binding to port 0xd6bed9fffebe94af
Jul 09 15:04:45 800414 [AA4F3700] 0x01 -> osm_vendor_bind: ERR 5426: Unable to register class 129 version 1
Jul 09 15:04:45 800422 [AA4F3700] 0x01 -> osm_sm_mad_ctrl_bind: ERR 3118: Vendor specific bind failed
Jul 09 15:04:45 800425 [AA4F3700] 0x01 -> osm_sm_bind: ERR 2E10: SM MAD Controller bind failed (IB_ERROR)
Jul 09 15:04:45 800430 [AA4F3700] 0x01 -> osm_sa_mad_ctrl_unbind: ERR 1A11: No previous bind
Jul 09 15:04:45 829702 [AA4F3700] 0x80 -> Exiting SM
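I would appreciate some pointers so that I can understand where I am going wrong.
Thanks, Shachar. – dhara
Shachar, I tried the suggestion above. The IP address does not seem to be recognized. – dhara
Server # rping -s -a localhost
Client # rping -c -a localhost
rdma_resolve_route: No such device
Server # rping -s -a 10.213.41.231
rdma_bind_addr: No such file or directory – dhara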