Python: writing to a file (ignoring non-ASCII characters)

I am on Linux and want to write strings (in UTF-8) to a txt file. I have tried many approaches, but I always get an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 36: ordinal not in range(128)

Is there a way to write only the ASCII characters to the file and ignore the non-ASCII ones? My code:

# -*- coding: UTF-8-*- 

import os 
import sys 


def __init__(self, dirname, speaker, file, exportFile): 

    text_file = open(exportFile, "a") 

    text_file.write(speaker.encode("utf-8")) 
    text_file.write(file.encode("utf-8")) 

    text_file.close()

Thanks.

2014-03-04 user3375111

Strip _non-ascii_ characters before writing? – devnull

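Have you tried `speaker.encode('utf-8', errors='ignore')`? I believe you are doing something else wrong, though, because you *shouldn't* get this error in the first place. Can you tell us what `speaker` and `file` are? Also, if you want to write binary data to a file, you should open the file in binary mode: `open(export_file, 'ab')`. – Bakuriu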
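
A minimal sketch of that suggestion, written for Python 3 (where text is `str` and encoded data is `bytes`). The helper name `append_ascii_only`, the temporary path, and the sample string are hypothetical and not taken from the question. It drops every non-ASCII character while encoding and appends the resulting bytes to a file opened in binary mode:

```python
import os
import tempfile


def append_ascii_only(path, text):
    """Append text to path, silently dropping non-ASCII characters."""
    # errors="ignore" skips characters the ASCII codec cannot encode.
    data = text.encode("ascii", errors="ignore")
    # The data is bytes, so the file is opened in binary append mode.
    with open(path, "ab") as handle:
        handle.write(data)
    return data


# Hypothetical sample data for illustration only.
demo_path = os.path.join(tempfile.mkdtemp(), "export.txt")
written = append_ascii_only(demo_path, "Dvo\u0159\u00e1k: Symfonie \u010d. 9\n")
print(written)
```

Note that dropping characters is lossy: a name such as "Dvořák" is written as "Dvok", so this only fits cases where the non-ASCII characters really are disposable.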

Try using the codecs module.

# -*- coding: utf-8 -*-

import codecs


def __init__(self, dirname, speaker, file, exportFile):
    # codecs.open encodes on write, so pass unicode text and do not
    # call .encode() yourself (assumes speaker and file are unicode).
    with codecs.open(exportFile, "a", "utf-8") as text_file:
        text_file.write(speaker)
        text_file.write(file)

Also, beware that the name of your `file` variable collides with the built-in `file`.

Finally, I suggest reading http://www.joelonsoftware.com/articles/Unicode.html to better understand what Unicode is, and one of these pages (depending on your Python version) to learn how to use it in Python:

2014-03-04 09:17:02

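I tried many approaches (including codecs), but I always get the same error. So I want to ignore the non-ASCII characters and write only ASCII to the file. (There is no variable named `file` in my program; that was just for the example.) – user3375111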

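What seems to be happening is that your variables are of type `str`. So when you call `str.encode('utf-8')`, Python automatically converts your `str` to `unicode` first by decoding it with the default encoding (ASCII in Python 2). I think it is this implicit conversion that fails, since the error message mentions 'ascii'. Are you sure that *all* your variables are of type `unicode`? –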
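
The implicit conversion described in this comment can be reproduced explicitly. The sketch below is Python 3 (where `bytes` has no `.encode` method), so it performs by hand the ASCII decode that Python 2 does silently before encoding a byte `str`. The sample value is hypothetical:

```python
# UTF-8 bytes for a non-ASCII character (U+00E9); the first byte is 0xc3.
raw = "\u00e9".encode("utf-8")

try:
    # Python 2 runs this step implicitly when .encode() is called on a
    # byte string: it decodes with the ASCII codec first.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("utf-8")
```

The failure is a decode error raised from an encode call, which is why the traceback in the question mentions the 'ascii' codec even though the code only asks for UTF-8.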

You can use the codecs module:

import codecs 
text_file = codecs.open(exportFile,mode='a',encoding='utf-8') 
text_file.write(...)
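
As a variant of this answer that matches the original request, `codecs.open` also takes an `errors` argument. The Python 3 sketch below assumes the data is already text and that losing the non-ASCII characters is acceptable; the path and sample string are hypothetical:

```python
import codecs
import os
import tempfile

# Hypothetical output path for illustration only.
export_path = os.path.join(tempfile.mkdtemp(), "export.txt")

# errors="ignore" makes the writer drop characters ASCII cannot represent.
with codecs.open(export_path, mode="a", encoding="ascii", errors="ignore") as text_file:
    text_file.write("caf\u00e9 au lait\n")

# Read the raw bytes back to see what actually reached the file.
with open(export_path, "rb") as check:
    content = check.read()

print(content)
```

In Python 3 the built-in `open` accepts the same `encoding` and `errors` arguments, so `codecs` is not strictly required there.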

2014-03-04 09:17:17

You can decode the input strings before writing them:

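# Python 2: decode both UTF-8 byte strings to unicode before re-encoding
speaker_text = speaker.decode("utf8")
file_text = file.decode("utf8")
with open(exportFile, "a") as text_file:
    text_file.write(speaker_text.encode("utf-8"))
    text_file.write(file_text.encode("utf-8"))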
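
When it is unclear whether a value arrives as bytes or as text, one option is to normalise it before writing. This is a Python 3 sketch under that assumption; `to_text`, `file_name`, and the sample values are hypothetical names introduced for illustration:

```python
import io
import os
import tempfile


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical inputs: one arrives as UTF-8 bytes, the other as text.
export_path = os.path.join(tempfile.mkdtemp(), "export.txt")
speaker = "Ren\u00e9e".encode("utf-8")
file_name = "take_01.wav"

# io.open encodes text on write, so no manual .encode() calls are needed.
with io.open(export_path, "a", encoding="utf-8") as text_file:
    text_file.write(to_text(speaker))
    text_file.write(to_text(file_name))

with io.open(export_path, encoding="utf-8") as check:
    content = check.read()
```

Decoding at the boundary and keeping text everywhere else avoids mixing the two types, which is what triggers the implicit ASCII conversion in the first place.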

2014-03-04 09:24:59
