Unicode and urllib.open

I'm building an application in Python that parses weather data from yr.no. It works fine with ASCII strings, but fails when I use Unicode.

def GetYRNOWeatherData(country, province, place):
    # Parse the XML file
    wtree = ET.parse(urllib.urlopen("http://www.yr.no/place/" + string.replace(country, ' ', '_').encode('utf-8') + "/" + string.replace(province, ' ', '_').encode('utf-8') + "/" + string.replace(place, ' ', '_').encode('utf-8') + "/forecast.xml"))

For example, when I try

GetYRNOWeatherData("France", "Île-de-France", "Paris")

I get this error:

'charmap' codec can't encode character u'\xce' in position 0: character maps to <undefined>

Does urllib not handle Unicode well? Since I'm using Tkinter as a front end to this function, could that be the source of the problem (do Tkinter's Entry widgets handle Unicode)?
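For readers hitting the same message, here is a minimal sketch of one way a 'charmap' error like this can arise. It is written for Python 3, and both the helper name `encode_error` and the choice of `cp437` are assumptions for illustration only: the code page actually in use on the asker's system is not known. It shows that UTF-8 can encode "Î" without trouble, while a legacy code page that lacks the character produces the 'charmap' message.

```python
# Python 3 sketch. encode_error is a hypothetical helper, and cp437 is
# only an example of a legacy code page that has no "Î" (U+00CE).


def encode_error(value, codec):
    """Return the UnicodeEncodeError message for value, or None on success."""
    try:
        value.encode(codec)
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    province = "Île-de-France"
    # UTF-8 can represent every Unicode character, so this prints None.
    print(encode_error(province, "utf-8"))
    # A charmap-based code page without "Î" fails at position 0.
    print(encode_error(province, "cp437"))
```

If the message comes from a code page like this, the failing step is probably an implicit encode (for example when printing or writing to a console), not the explicit `.encode('utf-8')` call.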

2014-07-08 Igor

You can handle this by keeping every string as unicode right up until you actually make the urllib.urlopen request, at which point you encode to UTF-8:

#!/usr/bin/python 
# -*- coding: utf-8 -*- 

# This import makes all literal strings in the file default to 
# type 'unicode' rather than type 'str'. You don't need to use this, 
# but you'd need to do u"France" instead of just "France" below, and 
# everywhere else you have a string literal. 
from __future__ import unicode_literals 

import urllib 
import xml.etree.ElementTree as ET 

def do_format(*args): 
    ret = [] 
    for arg in args: 
        ret.append(arg.replace(" ", "_"))
    return ret 


def GetYRNOWeatherData(country, province, place): 
    country, province, place = do_format(country, province, place) 
    url = "http://www.yr.no/place/{}/{}/{}/forecast.xml".format(country, province, place) 
    wtree = ET.parse(urllib.urlopen(url.encode('utf-8'))) 
    return wtree 


if __name__ == "__main__": 
    GetYRNOWeatherData("France", "Île-de-France", "Paris")
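The answer above targets Python 2. As a hedged sketch of the same idea for Python 3, the URL can be built from text strings and each path segment percent-encoded, so the final URL is plain ASCII. The helper name `build_forecast_url` is hypothetical, and using `urllib.parse.quote` is a substitute for the `url.encode('utf-8')` call in the answer, not the answerer's own method. Whether the server accepts the percent-encoded form is not confirmed here, and the sketch only builds the URL without fetching it.

```python
# Python 3 sketch; build_forecast_url is a hypothetical helper name.
from urllib.parse import quote


def build_forecast_url(country, province, place):
    """Return the yr.no forecast URL with each path segment percent-encoded."""
    segments = []
    for part in (country, province, place):
        # Same space-to-underscore rule as in the answer, then percent-encode
        # the UTF-8 bytes of the segment so the URL contains only ASCII.
        segments.append(quote(part.replace(" ", "_"), safe=""))
    return "http://www.yr.no/place/{}/{}/{}/forecast.xml".format(*segments)


if __name__ == "__main__":
    print(build_forecast_url("France", "Île-de-France", "Paris"))
    # http://www.yr.no/place/France/%C3%8Ele-de-France/Paris/forecast.xml
```

The resulting string could then be passed to `urllib.request.urlopen` and the response handed to `ET.parse`, mirroring the structure of the Python 2 version.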

2014-07-08 15:33:57 dano

@Igor Sorry, I had swapped the positions of 'province' and 'place' when building 'url'. It should work correctly if you swap them back (I just made that edit to the answer). – dano

Yes, I noticed that too. Thanks. – Igor
