2014-07-08

Unicode and urllib.open

I'm writing a Python application that parses weather data from yr.no. It works fine with ASCII strings but fails when the strings contain Unicode characters.

def GetYRNOWeatherData(country, province, place): 

    #Parse the XML file 

    wtree = ET.parse(urllib.urlopen("http://www.yr.no/place/" + string.replace(country, ' ', '_').encode('utf-8') + "/" + string.replace(province, ' ', '_').encode('utf-8') + "/" + string.replace(place, ' ', '_').encode('utf-8') + "/forecast.xml")) 

For example, when I try

GetYRNOWeatherData("France", "Île-de-France", "Paris") 

I get this error:

'charmap' codec can't encode character u'\xce' in position 0: character maps to <undefined> 

Does urllib not handle Unicode well? Since I'm using Tkinter as a front end to this function, could that be the source of the problem (does the Tkinter Entry widget handle Unicode)?
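For reference, the 'charmap' wording in the message above can be reproduced in isolation. The sketch below assumes the failure comes from encoding to a code page that has no slot for "Î" (cp437 is used here purely as an example; the traceback in the question does not say which codec was involved). `describe_encode` is a hypothetical helper, not part of the original code.

```python
# -*- coding: utf-8 -*-

def describe_encode(text, codec):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc


name = u"\u00cele-de-France"  # u"\u00ce" is the capital I with circumflex

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_result = describe_encode(name, "utf-8")

# cp437 (assumed here for illustration) has no capital I with circumflex,
# so Python raises a UnicodeEncodeError that names the 'charmap' codec.
cp437_result = describe_encode(name, "cp437")

print(repr(utf8_result))
print(cp437_result)
```

If the same text encodes cleanly as UTF-8 but fails like this, the encode step that raised is probably not the explicit `.encode('utf-8')` call but some other conversion, such as printing to a console.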

Answers


You can handle this by keeping every string as unicode right up until you actually make the urllib.urlopen request, at which point you encode it as UTF-8:

#!/usr/bin/python 
# -*- coding: utf-8 -*- 

# This import makes all literal strings in the file default to 
# type 'unicode' rather than type 'str'. You don't need to use this, 
# but you'd need to do u"France" instead of just "France" below, and 
# everywhere else you have a string literal. 
from __future__ import unicode_literals 

import urllib 
import xml.etree.ElementTree as ET 

def do_format(*args): 
    ret = [] 
    for arg in args: 
        ret.append(arg.replace(" ", "_"))
    return ret 


def GetYRNOWeatherData(country, province, place): 
    country, province, place = do_format(country, province, place) 
    url = "http://www.yr.no/place/{}/{}/{}/forecast.xml".format(country, province, place) 
    wtree = ET.parse(urllib.urlopen(url.encode('utf-8'))) 
    return wtree 


if __name__ == "__main__": 
    GetYRNOWeatherData("France", "Île-de-France", "Paris") 
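
The code above hands the raw UTF-8 bytes of the URL to urllib.urlopen. A possible variation, not part of the original answer, is to percent-encode each path segment before building the URL, so the final URL is plain ASCII. This is only a sketch: `build_forecast_url` is a hypothetical helper, it assumes the arguments are unicode strings, and whether yr.no treats percent-encoded paths the same way has not been verified here. The import fallback lets it run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import quote  # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3


def build_forecast_url(country, province, place):
    """Build the forecast URL with UTF-8, percent-encoded path segments.

    Hypothetical helper; assumes all three arguments are unicode strings.
    """
    segments = []
    for part in (country, province, place):
        # Stay in unicode until the last moment, then encode to UTF-8 bytes.
        utf8_bytes = part.replace(u" ", u"_").encode("utf-8")
        # quote() turns every non-ASCII byte into a %XX escape.
        segments.append(quote(utf8_bytes))
    return "http://www.yr.no/place/{0}/{1}/{2}/forecast.xml".format(*segments)


print(build_forecast_url(u"France", u"\u00cele-de-France", u"Paris"))
```

The resulting string can be passed to urllib.urlopen in place of `url.encode('utf-8')`.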

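@Igor Sorry, I had reversed the positions of 'province' and 'place' when creating 'url'. It should work correctly once you swap them back (I have just done that in an edit to the answer). – dano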


Yes, I noticed that too. Thanks. – Igor