2013-03-27 82 views

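IPython 0.13 ZMQ errors

I've run into strange behavior with an IPython cluster. The computation finishes, but many results never reach the client (and after completing their first computation, the engines just sit idle).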

I suspect something is wrong with ZMQ because: 1) From time to time I see the following error:

      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 118, in get
        if not self.ready():
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 132, in ready
        self.wait(0)
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 142, in wait
        self._ready = self._client.wait(self.msg_ids, timeout)
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 1058, in wait
        self.spin()
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 1015, in spin
        self._flush_results(self._task_socket)
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 814, in _flush_results
        idents,msg = self.session.recv(sock, mode=zmq.NOBLOCK)
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/zmq/session.py", line 642, in recv
        idents, msg_list = self.feed_identities(msg_list, copy)
      File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/zmq/session.py", line 673, in feed_identities
        idx = msg_list.index(DELIM)
    ValueError: '<IDS|MSG>' is not in list

2) Additionally, IPython.zmq has two test failures:

    ======================================================================
    ERROR: test_send (IPython.zmq.tests.test_session.TestSession)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/clusterdata/python/env_stable/lib/python2.7/site-packages/IPython/zmq/tests/test_session.py", line 76, in test_send
        socket = MockSocket(zmq.Context.instance(),zmq.PAIR)
      File "/clusterdata/python/env_stable/lib/python2.7/site-packages/IPython/zmq/tests/test_session.py", line 34, in __init__
        self.data = []
      File "/clusterdata/python/env_stable/lib/python2.7/site-packages/zmq/sugar/attrsettr.py", line 38, in __setattr__
        self.__class__.__name__, upper_key)
    AttributeError: MockSocket has no such option: DATA

    ======================================================================
    ERROR: test_send (IPython.zmq.tests.test_session.TestSession)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/clusterdata/python/env_stable/lib/python2.7/site-packages/zmq/tests/__init__.py", line 108, in tearDown
        raise RuntimeError("context could not terminate, open sockets likely remain in test")
    RuntimeError: context could not terminate, open sockets likely remain in test

    ----------------------------------------------------------------------

I'm using pyzmq 13.0.0 (as installed by pip) with zeromq 3.2.2, compiled by pyzmq's setup. I'm using IPython 0.13.1 and Python 2.7.3.

Any suggestions as to what this might be? If not, how can I find out more about why these errors occur?

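Update: It turns out the slowdown was due to a long task queue in ipcontroller, which then took 100% CPU and lagged horribly. That is a separate issue, but I would still appreciate feedback on the above.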


The MockSocket error only affects the tests themselves, and is fixed in 0.13.2 ([release candidate here](http://archive.ipython.org/testing/0.13.2)). – minrk 2013-03-27 21:34:53


Any idea what the other error might be? Also, regarding the update: should ipcontroller lag like hell with 4000 jobs (when it doesn't lag with a few hundred)? – 2013-03-27 22:09:32


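It clearly shouldn't, but that doesn't mean anything is wrong with your system. If you have a lot of jobs, I strongly recommend setting TaskScheduler.hwm to a large value. – minrk 2013-03-28 05:09:13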

Answer


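Answering from @minrk's comments: the ZMQ errors were insignificant. The performance problem was due to scheduling, and was resolved by setting TaskScheduler.hwm=0.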
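
For anyone landing here, a minimal sketch of how that setting might be applied, assuming the controller reads its configuration from an `ipcontroller_config.py` file in the active profile directory (the file location and the stated meaning of `0` are assumptions based on the IPython parallel documentation, not something confirmed in this thread):

```python
# ipcontroller_config.py (assumed to live in the profile directory,
# e.g. the one created by `ipython profile create --parallel`)

# get_config() is injected by IPython when it loads this file;
# it is not defined if the file is run as a standalone script.
c = get_config()

# High water mark: the maximum number of outstanding tasks the task
# scheduler will assign to a single engine. As I understand the docs,
# 0 means "no limit", which is the value the answer above settled on.
c.TaskScheduler.hwm = 0
```

The controller would need to be restarted for the change to take effect. Passing `--TaskScheduler.hwm=0` on the `ipcontroller` command line should be equivalent, since IPython configurables can generally be set that way.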