How do I convert between bytes and strings in Python 3?

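This is a Python 101 type question, but it had me baffled for a while when I tried to use a package that seemed to convert my string input into bytes. How do I convert between bytes and strings in Python 3?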

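As you will see below, I found the answer for myself, but I feel it is worth recording here because of the time it took me to dig out what was going on. It seems to be generic to Python 3, so I have not referred to the original package I was playing with; it does not seem to be a bug (just that the particular package had a .tostring() method that was clearly not producing what I understood as a string...)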

My test program goes like this:

import mangler         # spoof package 

stringThing = """ 
<Doc> 
    <Greeting>Hello World</Greeting> 
    <Greeting>你好</Greeting> 
</Doc> 
""" 

# print out the input 
print('This is the string input:') 
print(stringThing) 

# now make the string into bytes 
bytesThing = mangler.tostring(stringThing) # pseudo-code again 

# now print it out 
print('\nThis is the bytes output:') 
print(bytesThing)

The output from this code is:

This is the string input: 

<Doc> 
    <Greeting>Hello World</Greeting> 
    <Greeting>你好</Greeting> 
</Doc> 


This is the bytes output: 
b'\n<Doc>\n    <Greeting>Hello World</Greeting>\n    <Greeting>\xe4\xbd\xa0\xe5\xa5\xbd</Greeting>\n</Doc>\n'

So there is a need to be able to convert between bytes and strings, to avoid ending up with non-ASCII characters turned into gobbledegook.

2012-12-23 Bobble

[This question](http://stackoverflow.com/questions/7585435/best-way-to-convert-string-to-bytes-in-python-3) goes into more detail in its answers, but I think the brief answer below is clearer. – Bobble

The "mangler" in the code sample above was doing the equivalent of this:

bytesThing = stringThing.encode(encoding='UTF-8')

There are other ways to write this (notably bytes(stringThing, encoding='UTF-8')), but the syntax above makes it obvious what is going on, and also what to do to recover the string:

newStringThing = bytesThing.decode(encoding='UTF-8')

When we do this, the original string is recovered.

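Note that str(bytesThing) just transcribes all the gobbledegook (it returns the printable representation, b'...' prefix and escape sequences included) without converting it back into Unicode, unless you specifically request UTF-8, viz. str(bytesThing, encoding='UTF-8'). No error is reported if the encoding is not specified.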
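To make the round trip and the str() pitfall concrete, here is a minimal sketch. The variable names are my own for illustration; only str.encode, bytes.decode and str are from Python itself.

```python
# Round trip between str and bytes, with the encoding stated explicitly.
original = "<Greeting>你好</Greeting>"

encoded = original.encode(encoding="UTF-8")   # str -> bytes
decoded = encoded.decode(encoding="UTF-8")    # bytes -> str

# str() without an encoding does not decode anything: it returns the
# repr of the bytes object, b'...' prefix and \x escapes included.
transcribed = str(encoded)

# str() with an encoding decodes, like bytes.decode().
converted = str(encoded, encoding="UTF-8")

print(decoded == original)      # the round trip recovers the string
print(transcribed)              # still the escaped bytes, as text
print(converted == original)
```

The point to take away is that the second print shows text that merely looks like the bytes literal, which is why the missing encoding goes unnoticed.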

2012-12-23 11:22:01 Bobble

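If you look at the actual method implementations you will see that 'utf-8' is the default encoding, so if you know the encoding is indeed 'utf-8', plain 'stringThing.encode()' and 'bytesThing.decode()' will do fine. – ccpizza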

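@ccpizza Giving the encoding explicitly, as in the example above, makes it much clearer and is good practice IMHO. Not all Unicode is UTF-8. It also avoids the silent failure mentioned in the last paragraph. – Bobble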

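Totally agree; explicit is better than implicit, but it is good to know **what is** implicit. Whether to use it or not is another matter. Just because you can doesn't mean you should :) – ccpizza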
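The two points in these comments (UTF-8 is the default, and not all Unicode is UTF-8) can be checked with a short sketch. The sample text and variable names are illustrative only.

```python
text = "你好"

# With no arguments, encode() and decode() use UTF-8.
default_bytes = text.encode()
same_as_explicit = default_bytes == text.encode(encoding="UTF-8")

# The same text in another Unicode encoding gives different bytes.
utf16_bytes = text.encode(encoding="UTF-16")
differs = utf16_bytes != default_bytes

# Decoding those bytes with the wrong codec fails instead of round-tripping.
try:
    utf16_bytes.decode()  # default codec is UTF-8
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

print(same_as_explicit, differs, wrong_codec_failed)
```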

In Python 3 there is a bytes() constructor that takes the same arguments as encode().

str1 = b'hello world'
str2 = bytes("hello world", encoding="UTF-8")
print(str1 == str2) # prints True

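I didn't read anything about this in the docs, but perhaps I wasn't looking in the right place. This way you can explicitly turn a string into a byte stream, it reads better than using encode and decode, and you don't have to put the b prefix in front of the quotes.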
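As a quick check of the claim above, this sketch compares the constructor with the method, and shows that bytes() refuses a string when no encoding is given. Variable names are made up for the example.

```python
text = "hello world"

via_constructor = bytes(text, encoding="UTF-8")
via_method = text.encode(encoding="UTF-8")
literal = b"hello world"

# All three spellings produce the same bytes object.
all_equal = via_constructor == via_method == literal

# Unlike encode(), bytes() has no default encoding for a str argument.
try:
    bytes(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

print(all_equal, needs_encoding)
```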

2014-05-22 19:03:22 NuclearPeon

Try:

StringVariable = ByteVariable.decode('UTF-8', 'ignore')

To test the type:

print(type(StringVariable))

Here 'StringVariable' stands for a string and 'ByteVariable' stands for bytes. They are not related to the variables in the question.
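One caveat about the 'ignore' argument: it silently drops any bytes that are not valid UTF-8, so data can go missing. A small sketch with a made-up sample (Latin-1 bytes for "café", which are not valid UTF-8):

```python
data = b"caf\xe9"  # hypothetical input: Latin-1 encoded, not UTF-8

ignored = data.decode("UTF-8", "ignore")     # invalid byte is dropped
replaced = data.decode("UTF-8", "replace")   # invalid byte becomes U+FFFD

# The default error handler ('strict') raises instead.
try:
    data.decode("UTF-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(ignored), repr(replaced), strict_failed)
print(type(ignored))
```

If the bytes really are UTF-8, all three handlers give the same result; they only differ on malformed input.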

2017-09-29 17:31:37
