UnicodeError replacement not working - Python


I want to replace non-Unicode characters with _, but although this program runs without errors, it does not solve the problem, and I cannot work out why.

import csv 
import unicodedata 
import pandas as pd 

df = pd.read_csv('/Users/pabbott/Desktop/Unicode.csv', sep = ',', 
index_col=False, converters={'ClinetEMail':str, 'ClientZip':str, 
'LocationZip':str, 'LicenseeName': str, 'LocationState':str, 
'AppointmentType':str, 'ClientCity':str, 'ClientState':str}) 

data = df 
for row in data: 
    for val in row: 
     try: 
      val.encode("utf-8") 
     except UnicodeDecodeError: 
      replace(val,"_") 

data.to_csv('UnicodeExport.csv', sep=',', index=False, 
quoting=csv.QUOTE_NONNUMERIC)

Source

2017-07-14 Pranav Abbott

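What error are you getting? – MattR

Posting a code dump is not a question. –

I am not getting any error, because the code compiles correctly, but in the new file those non-Unicode characters are not replaced with _ as expected. I wonder whether this is a problem with the data.apply function? –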

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa4 in position 4: invalid start byte

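The message above (raised by pd.read_csv) shows that the file is not saved as utf-8. You need to

either save the file as utf-8,
or read the file using the correct encoding.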

For example (a variant of the latter), add encoding='windows-1252' to df = pd.read_csv(… as follows:

df = pd.read_csv('/Users/pabbott/Desktop/Unicode.csv', sep = ',', encoding='windows-1252', 
index_col=False, converters={'ClinetEMail':str, 'ClientZip':str, 
'LocationZip':str, 'LicenseeName': str, 'LocationState':str, 
'AppointmentType':str, 'ClientCity':str, 'ClientState':str})

Then you can omit all that try: val.encode("utf-8") stuff in the for row in data: for val in row: loop.
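
As for why the original loop changed nothing: iterating over a DataFrame yields its column labels rather than its cells, and in Python 3 calling .encode("utf-8") on a str does not raise UnicodeDecodeError, so the except branch never runs (and replace(val, "_") is not a defined function in any case). If the goal is still to put _ in place of certain characters once the file has been read correctly, the following is a minimal sketch. It assumes that "non-Unicode" really means non-ASCII; replace_non_ascii is a hypothetical helper, not part of pandas.

```python
import re

# Matches any single character outside the 7-bit ASCII range.
_NON_ASCII = re.compile(r"[^\x00-\x7F]")


def replace_non_ascii(value, placeholder="_"):
    """Return value with every non-ASCII character replaced by placeholder.

    Non-string values (numbers, None, NaN) are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    return _NON_ASCII.sub(placeholder, value)


print(replace_non_ascii("abc \u00a4 def"))  # abc _ def
```

With a DataFrame, this could be applied one string column at a time, for example df['ClientCity'] = df['ClientCity'].map(replace_non_ascii).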

From the pandas.read_csv documentation:

encoding : str, default None

Encoding to use for UTF when reading/writing (ex. 'utf-8'). List of Python standard encodings.
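
To see what the encoding choice does, here is a small sketch that decodes the same bytes under different codecs with plain Python, no pandas involved. The byte string is made up so that 0xa4 sits at position 4, as in the error message; windows-1252 is still only the assumed encoding.

```python
raw = b"1234\xa4"  # hypothetical data; 0xa4 is at index 4

failed_at = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the byte that could not be decoded
    print(exc)

# Under windows-1252 the byte 0xa4 is the currency sign (U+00A4).
as_cp1252 = raw.decode("windows-1252")

# errors="replace" keeps going and substitutes U+FFFD for the bad byte.
lossy = raw.decode("utf-8", errors="replace")

print(failed_at, repr(as_cp1252), repr(lossy))
```

The errors="replace" route loses the original character, so picking the right encoding is usually preferable. Newer pandas releases (1.3 and later, as far as I know) expose a similar knob as the encoding_errors argument of read_csv.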

Source

2017-07-15 04:48:26 JosefZ
