Python 3: converting non-English characters to English characters

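I have a text file. I read it and, after some processing, write the lines to another file. The input file contains some Turkish characters such as "İ, Ö, Ü, Ş, Ç, Ğ". I want to convert these characters to English characters, because they are not displayed when I open the file with UTF-8 encoding. My code is below: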

for i in range(len(singleLine)):
    if singleLine[i] == "İ":
        singleLine.replace(singleLine[i:i+1], "I")
    if singleLine[i] == "Ü":
        singleLine.replace(singleLine[i:i+1], "U")
    if singleLine[i] == "Ö":
        singleLine.replace(singleLine[i:i+1], "O")
    if singleLine[i] == "Ç":
        singleLine.replace(singleLine[i:i+1], "C")
    if singleLine[i] == "Ş":
        singleLine.replace(singleLine[i:i+1], "S")
    if singleLine[i] == "Ğ":
        singleLine.replace(singleLine[i:i+1], "G")
return singleLine

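But the code does not recognize these Turkish characters in the input file, and they end up in the output file unchanged.

How can I recognize these characters? Is there a special way to search based on ASCII codes, or something similar?

Source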

2016-06-08 abidinberkay

As suggested in the comment (answer for switch case), the method I used was:

choices = {"İ":"I", "ş" : "s"...}
singleLine = singleLine.replace(singleLine[i:i+1], choices.get(singleLine[i], singleLine[i]))

and that solved it.
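For comparison, the same dictionary idea can be expressed without an index loop by using `str.translate`. This is only a sketch: the table name `TURKISH_TO_ASCII`, the helper `to_ascii`, and the lowercase letters in the mapping are assumptions added for illustration, not part of the answer above.

```python
# Hypothetical mapping: the six letters from the question plus their
# lowercase forms. Both strings must have the same length.
TURKISH_TO_ASCII = str.maketrans("İÖÜŞÇĞıöüşçğ", "IOUSCGiouscg")


def to_ascii(line: str) -> str:
    """Return a copy of line with the mapped Turkish letters replaced."""
    # translate() builds a new string; characters missing from the
    # table are copied through unchanged.
    return line.translate(TURKISH_TO_ASCII)


converted = to_ascii("İ,Ö,Ü,Ş,Ç,Ğ")
```

The table is built once and each line is processed in a single pass, so no per-character `replace()` call is needed.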

Source

2016-06-08 08:11:45 abidinberkay

str instances are immutable, so str.replace() cannot operate in place; it returns the result instead.
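A minimal sketch of that point, using a made-up sample string (the value `"İstanbul"` and the variable names are assumptions for illustration):

```python
single_line = "İstanbul"

# replace() returns a new string; the returned value is discarded here,
# so single_line still refers to the original text.
single_line.replace("İ", "I")
unchanged = single_line

# Keeping the returned value is what makes the change visible.
fixed = single_line.replace("İ", "I")
```

This is why the loop in the question appears to do nothing: every `replace()` result is thrown away.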

But don't do things the hard way.

>>> import unidecode 
>>> unidecode.unidecode('İ,Ö,Ü,Ş,Ç,Ğ') 
'I,O,U,S,C,G'
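If installing a third-party package is not an option, a rough standard-library alternative is to decompose each character and drop the combining marks. This is a different technique from unidecode, not a replacement for it, and the function name `strip_diacritics` is hypothetical. It only handles letters that Unicode decomposes into a base letter plus a mark; the Turkish dotless "ı", for example, has no such decomposition and passes through unchanged.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    # NFD splits e.g. "Ö" into "O" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters, so only marks are dropped.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


result = strip_diacritics("İ,Ö,Ü,Ş,Ç,Ğ")
```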

Source

2016-06-08 07:35:35

Thanks. I answered the question in the post below, thanks to the comment in your first sentence. – abidinberkay
