Screen scraping a Twitter page in Python: Unicode equal comparison failed
I use the code below to get the list of a Twitter user's followers:
import urllib
from BeautifulSoup import BeautifulSoup

# code only looks at one page of followers instead of continuing to all of a user's followers
# decided to only use a small sample
site = "http://mobile.twitter.com/NYTimesKrugman/following"
friends = set()
response = urllib.urlopen(site)
html = response.read()
soup = BeautifulSoup(html)
names = soup.findAll('a', {'href': True})
for name in names:
    a = name.renderContents()
    b = a.lower()
    if ("http://mobile.twitter.com/" + b) == name['href']:
        c = str(b)
        friends.add(c)
for friend in friends:
    print friend
print ("Done!")
However, I get the following output:
NYTimeskrugman
nytimesphoto
rasermus
Warning (from warnings module):
  File "C:\Users\Public\Documents\Columbia Job\Python Crawler\Twitter Crawler\crawlerversion14.py", line 42
    if ("http://mobile.twitter.com/" + b) == name['href']:
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
amnesty_norge
zynne_
fredssenteret
oljestudentene
solistkoret
.... (and so it continues)
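It seems I am able to get most of the names, but I also get this somewhat random warning. It does not stop the code from finishing, but I am hoping someone can tell me what is happening.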
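This warning appears because you are comparing a byte string that contains non-ASCII characters with a unicode string. Python 2 tries to decode the byte string as ASCII for the comparison, fails, and treats the two values as unequal. In practice, though, you should just use a library to query Twitter. See https://dev.twitter.com/docs/twitter-libraries#python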
`u"http://mobile.twitter.com/"` – leoluk
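One way to act on the two comments above is to decode the link text explicitly before comparing, so that both sides of `==` are unicode. The following is only a minimal sketch. It assumes that `renderContents()` returns a UTF-8 encoded byte string and that `name['href']` is already unicode. The helper `profile_name` and the constant `PREFIX` are hypothetical names introduced for illustration, not part of the original script.

```python
# Unicode prefix, so that concatenation never mixes bytes and text.
PREFIX = u"http://mobile.twitter.com/"


def profile_name(link_text, href, encoding="utf-8"):
    """Return the lower-cased link text if href is PREFIX + that text, else None.

    link_text may be a byte string (assumed here to be UTF-8) or unicode.
    """
    if isinstance(link_text, bytes):
        try:
            # Decode explicitly instead of relying on an implicit ASCII decode.
            link_text = link_text.decode(encoding)
        except UnicodeDecodeError:
            # Bytes that are not valid in the assumed encoding cannot match.
            return None
    name = link_text.lower()
    return name if PREFIX + name == href else None


# Made-up sample data standing in for (renderContents(), href) pairs.
samples = [
    (b"zynne_", u"http://mobile.twitter.com/zynne_"),
    (u"\xc5se".encode("utf-8"), u"http://mobile.twitter.com/\xe5se"),
    (b"Sign in", u"http://mobile.twitter.com/session/new"),
]
friends = set()
for text, href in samples:
    found = profile_name(text, href)
    if found is not None:
        friends.add(found)
```

In the original loop this would replace the `if` test with something like `profile_name(name.renderContents(), name['href'])`. If the page turns out to use a different encoding, the `encoding` argument would need to change accordingly.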