Encoding problem using docx

I'm trying to use Python's docx module to add a degree symbol to a Word document. My function is defined like this:

def convert_decimal_degrees2DMS(self,value): 
    #value = math.fabs(value) 
    degrees = int(value) 
    submin = math.fabs((value - int(value)) * 60) 
    minutes = int(submin) 
    subseconds = round(math.fabs((submin-int(submin)) * 60),1) 
    subseconds = int(subseconds) 
    self.angle = str(degrees) + " Degrees " + str(minutes) + " Minutes " +\ 
       str(subseconds)[0:2] + " Seconds " 
    #self.angle = str(degrees) + "-" + str(minutes) + "-" + str(subseconds) 
    #return str(degrees) + "-" + str(minutes) + "-" + str(subseconds) 
    #degree = u'\N{DEGREE SIGN}'.encode('utf-8') 
    return "{0}{1}{2}'{3}''".format(degrees,u'°'.encode('cp1252'),minutes,subseconds)

The error I keep getting is this:

File "lxml.etree.pyx", line 921, in lxml.etree._Element.text.__set__ (src\lxml\lxml.etree.c:41467) 
    File "apihelpers.pxi", line 652, in lxml.etree._setNodeText (src\lxml\lxml.etree.c:18888) 
    File "apihelpers.pxi", line 1335, in lxml.etree._utf8 (src\lxml\lxml.etree.c:24701) 
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters 
Exception AttributeError: "'NoneType' object has no attribute 'print_exc'" in <function _remove at 0x01E0F770> ignored

I have tried many variations and none of them worked. I suspect that is because I don't really understand how this encoding works.

Source

2013-08-06 peztherez

What the function returns is what I add to the Word document. – peztherez

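u'°'.encode('cp1252') returns a byte string (type str) equivalent to '\xb0'. Likewise, in other places you are converting things to str. The error is telling you that lxml needs strings of type unicode (Unicode code points), not str (bytes). The degree sign itself is probably not the problem.

The solution is simply to supply Unicode strings instead: use u'°' instead of u'°'.encode('cp1252'), and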
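The difference between the two types can be checked directly. The following is a minimal sketch, not part of the original answer; the names degree, encoded and text_type are illustrative, and the code is written so that it should run on both Python 2.7 and Python 3 (where the byte type is called bytes rather than str).

```python
from __future__ import print_function

degree = u"\N{DEGREE SIGN}"        # text: a single Unicode code point, U+00B0
encoded = degree.encode("cp1252")  # bytes: the single byte 0xB0

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u"")

print(isinstance(degree, text_type))     # True
print(isinstance(encoded, text_type))    # False
print(len(encoded), ord(encoded[0:1]))   # 1 176
```

Only the first value is the kind of string the error message asks for.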

self.angle = degrees + u" Degrees " + minutes + u" Minutes " + \ 
       subseconds[0:2] + u" Seconds "

instead of

self.angle = str(degrees) + " Degrees " + str(minutes) + " Minutes " +\ 
       str(subseconds)[0:2] + " Seconds "

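(assuming degrees etc. are of type unicode, not str). In the question's function they are ints, so they would have to be converted first, for example with unicode(degrees). Note the u'' syntax for Unicode strings, as opposed to the '' syntax for byte strings.

One thing to remember when including non-ASCII characters in Python source code is the encoding header documented in PEP 0263. You follow the shebang line with an encoding declaration: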
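Applied to the question's function, the advice might look like the sketch below. It is an assumption-laden illustration rather than the asker's final code: decimal_degrees_to_dms is a hypothetical standalone name (the original is a method), the arithmetic is copied from the question, and the degree sign is written as the escape \u00b0 so the example does not depend on the source file's encoding. It should run on Python 2.7 and Python 3.

```python
from __future__ import print_function
import math


def decimal_degrees_to_dms(value):
    """Format decimal degrees as degrees, minutes, seconds in a Unicode string."""
    degrees = int(value)
    submin = math.fabs((value - degrees) * 60)
    minutes = int(submin)
    # Same rounding as the question: round to one decimal, then truncate.
    seconds = int(round((submin - minutes) * 60, 1))
    # The template is a Unicode literal and nothing is encoded, so the
    # result is text (unicode on Python 2, str on Python 3), never bytes.
    return u"{0}\u00b0{1}'{2}''".format(degrees, minutes, seconds)


print(decimal_degrees_to_dms(12.5) == u"12\u00b030'0''")  # True
```

Because format() converts the integers itself, no str() or unicode() calls are needed, and the returned value can be handed on as text.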

#!/usr/bin/python 
# -*- coding: UTF-8 -*-

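Just remember that using PEP 0263 does not magically make the str vs. unicode duality go away. '°' will be a str (byte string) containing whatever bytes are found in the source file on disk, so it is not necessarily of length 1 (in ISO-8859-1 it is equivalent to '\xb0', in DOS cp437 to '\xf8', and in UTF-8 to '\xc2\xb0'). Regardless of the source encoding, u'°' will be the Unicode code point U+00B0.

Below is an illustration of non-ASCII characters in source code. For this example it is important to look at the actual bytes of the source file. The source is UTF-8 encoded, so '°' has length 2; it is a byte string, after all.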
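The three byte sequences listed above can be reproduced by encoding the same code point with each codec. This is a small sketch using only the standard codecs; the variable names are illustrative, and it should run on Python 2.7 and Python 3 (the repr of the bytes is printed slightly differently on each).

```python
from __future__ import print_function

degree = u"\u00b0"  # the code point, independent of any file encoding

for codec in ("iso-8859-1", "cp437", "utf-8"):
    raw = degree.encode(codec)
    # Show the codec, the resulting bytes, and how many bytes there are.
    print(codec, repr(raw), len(raw))
    # Decoding with the same codec always gives back the one code point.
    assert raw.decode(codec) == degree
```

The text value is the same in every case; only its byte representation changes with the codec.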

$ cat x.py 
#!/usr/bin/python 
# -*- coding: UTF-8 -*- 

print repr('°') 
print len('°') 
print len(u'°') 

$ od -c -txC x.py 
0000000   #   !   /   u   s   r   /   b   i   n   /   p   y   t   h   o
         23  21  2f  75  73  72  2f  62  69  6e  2f  70  79  74  68  6f
0000020   n  \n   #       -   *   -       c   o   d   i   n   g   :
         6e  0a  23  20  2d  2a  2d  20  63  6f  64  69  6e  67  3a  20
0000040   U   T   F   -   8       -   *   -  \n  \n   p   r   i   n   t
         55  54  46  2d  38  20  2d  2a  2d  0a  0a  70  72  69  6e  74
0000060       r   e   p   r   (   '   °  **   '   )  \n   p   r   i   n
         20  72  65  70  72  28  27  c2  b0  27  29  0a  70  72  69  6e
0000100   t       l   e   n   (   '   °  **   '   )  \n   p   r   i   n
         74  20  6c  65  6e  28  27  c2  b0  27  29  0a  70  72  69  6e
0000120   t       l   e   n   (   u   '   °  **   '   )  \n
         74  20  6c  65  6e  28  75  27  c2  b0  27  29  0a
$ python x.py 
'\xc2\xb0' 
2 
1
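If you already hold such a byte string, decoding it with the codec it was written in gives back the single code point. The sketch below is an illustration under that assumption (the names raw and text are made up), and it also shows what decoding with the wrong codec produces. It should run on Python 2.7 and Python 3.

```python
from __future__ import print_function

raw = b"\xc2\xb0"           # the two bytes shown in the dump for the degree sign
text = raw.decode("utf-8")  # decode with the codec the file was saved in

print(len(raw), len(text))  # 2 1
print(text == u"\u00b0")    # True

# Decoding with the wrong codec does not fail here; it silently yields
# two characters (U+00C2 followed by U+00B0) instead of one.
print(raw.decode("cp1252") == u"\u00c2\u00b0")  # True
```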

Source

2013-08-06 22:42:33 wberry
