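Download a file using Python, BeautifulSoup and Selenium
I want to download the first PDB file from the search results (the download link is given below). I am using Python, Selenium, and BeautifulSoup. This is the code I have written so far: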
import urllib2
from BeautifulSoup import BeautifulSoup
from selenium import webdriver
uni_id = "P22216"
# set parameters
download_dir = "/home/home/Desktop/"
url = "http://www.rcsb.org/pdb/search/smart.do?smartComparator=and&smartSearchSubtype_0=UpAccessionIdQuery&target=Current&accessionIdList_0=%s" % uni_id
print "url - ", url
# opening the url
text = urllib2.urlopen(url).read()
#print "text : ", text
soup = BeautifulSoup(text)
#print soup
print
table = soup.find("table", {"class":"queryBlue"})
#print "table : ", table
status = 0
rows = table.findAll('tr')
for tr in rows:
    try:
        cols = tr.findAll('td')
        if cols:
            link = cols[1].find('a').get('href')
            print "link : ", link
            if link:
                if status==1:
                    main_url = "http://www.rcsb.org" + link
                    print "main_url-----", main_url
                    status = False
                    browser.click(main_url)
        status+=1
    except:
        pass
I am getting None.
How do I download the first file in the search list? (i.e. 2YGV in this case)
The download link is: /pdb/protein/P32447
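One way to pick out the first result link without a bare `except` hiding errors is sketched below. It is only a sketch: it uses Python 3 and the standard-library `html.parser` instead of BeautifulSoup, the class and function names (`FirstResultLink`, `first_structure_link`) are made up for illustration, and it assumes the result links contain `structureId=`, as in the link quoted in the comments.

```python
from html.parser import HTMLParser


class FirstResultLink(HTMLParser):
    """Remember the first structure link found in a cell of table.queryBlue."""

    def __init__(self):
        super().__init__()
        self.in_table = False   # inside <table class="queryBlue">
        self.in_cell = False    # inside a <td> of that table
        self.link = None        # first matching href, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table" and "queryBlue" in (attrs.get("class") or "").split():
            self.in_table = True
        elif tag == "td" and self.in_table:
            self.in_cell = True
        elif tag == "a" and self.in_cell and self.link is None:
            href = attrs.get("href")
            # Assumption: result links carry a structureId query parameter.
            if href and "structureId=" in href:
                self.link = href

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "table":
            self.in_table = False


def first_structure_link(html):
    """Return the first structure link in the results table, or None."""
    parser = FirstResultLink()
    parser.feed(html)
    return parser.link
```

Because the function returns `None` explicitly when nothing matches, it is easier to tell "no link on the page" apart from an exception that was silently swallowed.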
Works for me. I get '/pdb/explore/explore.do?structureId=2YGV'. What is the problem? Can't you download it? – ton1c
I get that too, but how do I download the file? That is my question. – sam
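Once the explore link is in hand, the remaining step is turning it into a file download. The following is a rough Python 3 sketch, not a confirmed solution from this thread. The `files.rcsb.org/download/<ID>.pdb` URL pattern is an assumption about the RCSB site and does not appear anywhere above, and the helper names are hypothetical.

```python
import os
from urllib.parse import urlparse, parse_qs
from urllib.request import urlretrieve

# Assumption: RCSB serves PDB files at this address; not taken from the thread.
DOWNLOAD_TEMPLATE = "https://files.rcsb.org/download/%s.pdb"


def structure_id_from_link(link):
    """Extract the structureId query parameter (e.g. '2YGV'), or None."""
    query = parse_qs(urlparse(link).query)
    ids = query.get("structureId")
    return ids[0].strip().upper() if ids else None


def download_url(structure_id):
    """Build the (assumed) direct download URL for a structure ID."""
    return DOWNLOAD_TEMPLATE % structure_id


def download_pdb(link, download_dir):
    """Save the PDB file for the given explore link into download_dir."""
    structure_id = structure_id_from_link(link)
    if structure_id is None:
        raise ValueError("no structureId in link: %r" % link)
    target = os.path.join(download_dir, structure_id + ".pdb")
    urlretrieve(download_url(structure_id), target)  # performs the HTTP request
    return target
```

With the link from the comment above, `download_pdb("/pdb/explore/explore.do?structureId=2YGV", "/home/home/Desktop/")` would attempt to save `2YGV.pdb` to the download directory. If a direct URL like this is available, a plain HTTP request is probably enough and Selenium may not be needed for this step.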