Python HTML encoding: \xc2\xa0

I have been struggling with this for a while. I am trying to write strings to HTML, but once I have cleaned them I run into formatting problems. Here is an example:

paragraphs = ['Grocery giant and household name Woolworths is battered and bruised. ', 
'But behind the problems are still the makings of a formidable company'] 

x = str(" ") 
for item in paragraphs: 
    x = x + str(item) 
x

Output:

"Grocery giant and household name\xc2\xa0Woolworths is battered and\xc2\xa0bruised. 
But behind the problems are still the makings of a formidable\xc2\xa0company"

Desired output:

"Grocery giant and household name Woolworths is battered and bruised. 
But behind the problems are still the makings of a formidable company"

I hope you can explain why this happens and how I can fix it. Thanks in advance!

Source

2015-09-06 Sam Perry

Have you checked the source strings for unusual Unicode whitespace characters?

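\xc2\xa0 is the byte pair 0xC2 0xA0, which is the UTF-8 encoding of the non-breaking space (U+00A0).

It is a whitespace character that looks like an ordinary space on screen but is a different code point. For more information, see Wikipedia: https://en.wikipedia.org/wiki/Non-breaking_space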

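I copied the code as pasted in your question and got the expected output, which suggests the non-breaking spaces are in your original source strings and were lost when the text was pasted here.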
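To make the byte pair concrete, here is a minimal sketch written for Python 3 (the question appears to use Python 2 byte strings). The `sample` text and the `find_nbsp` helper are hypothetical and only illustrate how one might locate the character.

```python
import unicodedata

# U+00A0 is the no-break space; its UTF-8 encoding is the two bytes C2 A0.
nbsp = "\u00a0"
encoded = nbsp.encode("utf-8")
print(encoded)                 # b'\xc2\xa0'
print(unicodedata.name(nbsp))  # NO-BREAK SPACE

# Hypothetical sample text containing a no-break space, for illustration only.
sample = "household name\u00a0Woolworths"


def find_nbsp(text):
    """Return the indexes at which a no-break space occurs in text."""
    return [i for i, ch in enumerate(text) if ch == "\u00a0"]


positions = find_nbsp(sample)
print(positions)  # [14]
```

Printing the positions (or the `repr` of the string) is a quick way to confirm whether scraped text really contains this character.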

Source

2015-09-06 03:32:49 liuyix

Thanks. That solved it. I added: x.replace("\xc2\xa0", " ")
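The same fix can be sketched for Python 3, where strings are Unicode text and the character is written as "\u00a0". The `clean_paragraphs` name is a hypothetical helper, and the input below is modelled on the question with the no-break spaces made explicit; it is an assumption about what the original data looked like.

```python
NBSP = "\u00a0"


def clean_paragraphs(paragraphs):
    """Join paragraphs and replace each no-break space with a plain space."""
    return "".join(paragraphs).replace(NBSP, " ")


# Hypothetical input modelled on the question, with no-break spaces explicit.
paragraphs = [
    "Grocery giant and household name\u00a0Woolworths is battered and\u00a0bruised. ",
    "But behind the problems are still the makings of a formidable\u00a0company",
]
cleaned = clean_paragraphs(paragraphs)
print(cleaned)

# If the data is still raw UTF-8 bytes, the replacement also works on bytes.
raw = "name\u00a0Woolworths".encode("utf-8")
raw_cleaned = raw.replace(b"\xc2\xa0", b" ")
print(raw_cleaned)  # b'name Woolworths'
```

Using `"".join(...)` instead of repeated string concatenation in a loop gives the same result as the loop in the question and is the more idiomatic way to combine a list of strings.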
