2013-02-08 34 views
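Cannot save UTF-8 encoded content in a string

I am querying the Twitter API and receiving UTF-8 encoded responses. Now I want to save those responses in a string using the format() function. This is what I have so far (I have already tried many alternatives).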

for user in userInfos: 
    tName = user["name"] if user["name"] is not None else "" 
    tLocation = user["location"] if user["location"] is not None else "" 
    tProfileImageUrl = user["profile_image_url"] if user["profile_image_url"] is not None else "" 
    tCreatedAt = user["created_at"] 
    tFavouritesCount = user["favourites_count"] 
    tUrl = user["url"] if user["url"] is not None else "" 
    tId = user["id"] 
    tProtected = user["protected"] 
    tFollowerCount = user["followers_count"] 
    tLanguage = user["lang"] 
    tVerified = user["verified"] 
    tGeoEnabled = user["geo_enabled"] 
    tTimeZone = user["time_zone"] if user["time_zone"] is not None else "" 
    tFriendsCount = user["friends_count"] 
    tStatusesCount = user["statuses_count"] 
    tScreenName = user["screen_name"] 

    # Custom characteristics 
    age = utl.get_age_in_years(birthdayDict[str(tId)]) 

    # Follower-friend-ratio 
    if tFriendsCount > 0: 
        foRatio = float(tFollowerCount)/float(tFriendsCount) 
    else: 
        foRatio = "" 

    # Age of account in weeks 
    numWeeks = utl.get_age_in_weeks(tCreatedAt) 

    # Tweets per time 
    tweetsPerWeek = float(tStatusesCount)/numWeeks 
    tweetsPerDay = tweetsPerWeek/7.0 

    in_users.remove(str(tId)) 

    outputList = [str(tName), 
        str(tScreenName), 
        str(tProfileImageUrl), 
        str(tLocation), 
        str(tCreatedAt), 
        str(tUrl), 
        str(age), 
        str(tStatusesCount), 
        str(tFollowerCount), 
        str(tFriendsCount), 
        str(tFavouritesCount), 
        str(foRatio), 
        str(tLanguage), 
        str(tVerified), 
        str(tGeoEnabled), 
        str(tTimeZone), 
        str(tProtected), 
        str(numWeeks), 
        str(tweetsPerWeek), 
        str(tweetsPerDay)] 

    pprint.pprint(outputList) 
    fOut.write("{}{}{}{}{}{}{}\n".format(twitterUsers[str(tId)], outputDelimiter, outputDelimiter.join(outputList), outputDelimiter, utl.get_date(), outputDelimiter, utl.get_time())) 

str(tName), str(tLocation) etc. give me an error whenever tName/tLocation contain something like \xe4:

ERROR:__main__:'ascii' codec can't encode character u'\xe4' in position 10: ordinal not in range(128) 
Traceback (most recent call last): 
    File "../code/userinfo_extraction_old.py", line 167, in <module> 
    outputList = [str(tName), 
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 10: ordinal not in range(128) 

I have tried to understand how this works, but I cannot figure out what is wrong here. I also tried using unicode() instead of str()... no luck.

...and you are running Python 2.something? – 2013-02-08 11:55:50

Yeah, Python 2.7, I forgot to mention that, sorry. – wnstnsmth 2013-02-08 11:57:59

try str = str.decode('utf-8') – 2013-02-08 11:59:24

Answers

To convert unicode data to a str, you need to specify an encoding. Use tName.encode('utf8').
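As a minimal sketch of why the error appears and how an explicit encoding avoids it: in Python 2, str() on a unicode object implicitly encodes with the ASCII codec, which the explicit encode("ascii") call below reproduces. The sample values are made up for illustration and are not taken from the Twitter data in the question.

```python
# Made-up sample value containing U+00E4, the character from the traceback.
name = u"J\xe4ger"

# Python 2's str(name) implicitly encodes with the ASCII codec.
# Doing the same thing explicitly raises the same exception type.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Naming the encoding explicitly succeeds and yields a byte string.
encoded = name.encode("utf-8")

# Joining as unicode first and encoding once at the end also works.
fields = [name, u"Z\xfcrich"]
line = u";".join(fields).encode("utf-8")
```

Encoding once, after the fields have been joined, keeps the intermediate values as unicode and avoids mixing byte strings with unicode strings.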

You may want to read up on Python and Unicode.
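A pattern that material on this topic commonly recommends is to keep text as unicode inside the program and encode only when writing output. The sketch below assumes the output file can be opened with io.open; the delimiter, file name, and row values are hypothetical stand-ins, not the question's actual data.

```python
import io
import os
import tempfile

delimiter = u";"  # hypothetical stand-in for outputDelimiter
row = [u"J\xe4ger", u"Z\xfcrich", 42, 0.5]  # made-up sample values

# Turn every field into unicode text instead of calling str() on it.
fields = [u"{}".format(value) for value in row]

# io.open with an encoding accepts unicode and encodes it on write.
path = os.path.join(tempfile.mkdtemp(), "users.csv")
with io.open(path, "w", encoding="utf-8") as f_out:
    f_out.write(delimiter.join(fields) + u"\n")

# Read the file back to check that the non-ASCII characters survived.
with io.open(path, "r", encoding="utf-8") as f_in:
    round_trip = f_in.read()
```

With this approach no individual field needs its own encode() call, because the file object handles the encoding in one place.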

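Thanks a lot. Yes, I have probably read one or two of those documents before, but since the topic is so boring, I always forget half of it a week later... Thanks for the links anyway. – wnstnsmth 2013-02-08 12:09:34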