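I am running make to compile a C library inside a Python project, and I use pexpect (with Python 3.3) for the automation part. The output of the make command is read by pexpect in chunks, and on one such chunk pexpect raises the error below while converting Python 3 bytes to Python 3 str. The main difficulty is that the problem is intermittent and does not occur often. While make is compiling the C library, pexpect throws a Unicode decode error: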
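UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1998-1999: unexpected end of data
The sample code below reproduces the problem when the data contains multibyte characters (i.e. special characters or any Unicode data). Pexpect fails to decode a chunk that holds only part of a multibyte character.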
#!/usr/bin/python
# -*- coding: utf-8 -*-
from base import pexpect

MAX_READ_CHUNK = 8

def run(cmd):
    child = pexpect.spawn(cmd, maxread=MAX_READ_CHUNK)
    while True:
        i = child.expect([pexpect.EOF, pexpect.TIMEOUT])
        if child.before:
            print(child.before)
        if i == 0:  # EOF
            break
        elif i == 1:  # TIMEOUT
            continue
    child.close()
    return child.exitstatus

############## Main ################
data = '“HELLO WORLD”'
# i.e. the UTF-8 bytes are b'\xe2\x80\x9cHELLO WORLD\xe2\x80\x9d'
print("Data in readable form = %s " % data)
print("Data in bytes = %s \n\n" % data.encode('utf-8'))
run("echo %s" % data)
Running it produces the following traceback:
Data in readable form = “HELLO WORLD”
Data in bytes = b'\xe2\x80\x9cHELLO WORLD\xe2\x80\x9d'
_cast_unicode() enc=[utf-8] s=[b'\xe2\x80\x9cHELLO']
_cast_unicode() enc=[utf-8] s=[b' WORLD\xe2\x80']
Traceback (most recent call last):
  File "test.py", line 33, in <module>
    run("echo %s"%data)
  File "test.py", line 11, in run
    i = child.expect([pexpect.EOF,pexpect.TIMEOUT])
  File "/home/test/Downloads/base/pexpect.py", line 1358, in expect
    return self.expect_list(compiled_pattern_list, timeout, searchwindowsize)
  File "/home/test/Downloads/base/pexpect.py", line 1372, in expect_list
    return self.expect_loop(searcher_re(pattern_list), timeout, searchwindowsize)
  File "/home/test/Downloads/base/pexpect.py", line 1425, in expect_loop
    c = self.read_nonblocking (self.maxread, timeout)
  File "/home/test/Downloads/base/pexpect.py", line 1631, in read_nonblocking
    return super(spawn, self).read_nonblocking(size=size, timeout=timeout)
  File "/home/test/Downloads/base/pexpect.py", line 868, in read_nonblocking
    s2 = self._cast_buffer_type(s)
  File "/home/test/Downloads/base/pexpect.py", line 1614, in _cast_buffer_type
    return _cast_unicode(s, self.encoding)
  File "/home/test/Downloads/base/pexpect.py", line 156, in _cast_unicode
    return s.decode(enc)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 6-7: unexpected end of data
When MAX_READ_CHUNK is changed to 9 in the code above, it works fine.
# Output when MAX_READ_CHUNK = 9
Data in readable form = “HELLO WORLD”
Data in bytes = b'\xe2\x80\x9cHELLO WORLD\xe2\x80\x9d'
_cast_unicode() enc=[utf-8] s=[b'\xe2\x80\x9cHELLO ']
_cast_unicode() enc=[utf-8] s=[b'WORLD\xe2\x80\x9d\r']
_cast_unicode() enc=[utf-8] s=[b'\n']
“HELLO WORLD”
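The two runs differ only in where the chunk boundaries fall relative to the three-byte closing quote. The sketch below reproduces that without pexpect by slicing the same bytes and decoding each slice on its own, which is what the `_cast_unicode()` lines in the debug output appear to be doing. The names `payload` and `first_bad_chunk` are made up for illustration.

```python
# The bytes that `echo` writes to the pty: the quoted text plus "\r\n".
payload = "\u201cHELLO WORLD\u201d\r\n".encode("utf-8")


def first_bad_chunk(raw, size):
    """Decode raw in fixed-size slices, each slice independently.

    Returns the first slice that is not valid UTF-8 on its own,
    or None if every slice decodes.
    """
    for start in range(0, len(raw), size):
        chunk = raw[start:start + size]
        try:
            chunk.decode("utf-8")
        except UnicodeDecodeError:
            return chunk
    return None


# Size 8 cuts the closing quote after its second byte; size 9 does not.
print(first_bad_chunk(payload, 8))
print(first_bad_chunk(payload, 9))
```

With a chunk size of 8 the second slice ends in `\xe2\x80`, the first two bytes of the closing quote, so decoding it alone fails. With 9 every boundary happens to fall between characters, which would also explain why the failure is intermittent in real make output.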
How can I handle this "UnicodeDecodeError: 'utf-8' codec can't decode bytes in position ...: unexpected end of data" raised by pexpect during the make process?
This is a bug. It should be fixed in the upcoming pexpect 3.0.
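Until a fixed pexpect is available, one possible workaround is to keep the child's output as raw bytes and decode it yourself with an incremental decoder, which buffers an incomplete multibyte sequence until the remaining bytes arrive. This is only a sketch using the standard library's `codecs` module: it assumes you can get undecoded bytes out of your pexpect version, and `decode_chunks` is a hypothetical helper, not part of pexpect.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks that may split multibyte characters."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    pieces = []
    for chunk in chunks:
        # Trailing partial characters are held back inside the decoder
        # and completed by the next chunk.
        pieces.append(decoder.decode(chunk))
    # final=True raises if the stream really does end mid-character.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


# Same bytes as in the question, cut into 8-byte chunks.
raw = "\u201cHELLO WORLD\u201d\r\n".encode("utf-8")
chunks = [raw[i:i + 8] for i in range(0, len(raw), 8)]
text = decode_chunks(chunks)
print(text)
```

Here the 8-byte chunking that broke the original script decodes cleanly, because the two leading bytes of the closing quote are carried over to the next chunk instead of being decoded in isolation.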