The page uses JavaScript to build the URL:
<select name="tick">
<option value="TOCOMprice_20121122.csv">Nov 22, 2012</option>
<option value="TOCOMprice_20121121.csv">Nov 21, 2012</option>
<option value="TOCOMprice_20121120.csv">Nov 20, 2012</option>
<option value="TOCOMprice_20121119.csv">Nov 19, 2012</option>
<option value="TOCOMprice_20121116.csv">Nov 16, 2012</option>
</select>
<input type="button" onClick="location.href='/data/tick/' + document.form.tick.value;"
value="Download" style="width:7em;" />
It concatenates a path, which the browser resolves against the current site. So each URL is:
http://www.tocom.or.jp + /data/tick/ + TOCOMprice_*yearmonthday*.csv
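To make the resolution step concrete, here is a minimal sketch that mimics what the browser does with the button's `onClick` path. It assumes Python 3 and the site root `http://www.tocom.or.jp/`; the helper name `tick_url` is made up for illustration and is not part of the page or of any library.

```python
from datetime import date
from urllib.parse import urljoin

# Assumed site root; the browser resolves '/data/tick/...' against it.
SITE = 'http://www.tocom.or.jp/'

def tick_url(day):
    """Build the download URL for one day (hypothetical helper)."""
    filename = 'TOCOMprice_' + day.strftime('%Y%m%d') + '.csv'
    # A path starting with '/' replaces the whole path of the base URL.
    return urljoin(SITE, '/data/tick/' + filename)

print(tick_url(date(2012, 11, 22)))
# http://www.tocom.or.jp/data/tick/TOCOMprice_20121122.csv
```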
By the looks of it, the data only covers weekdays.
These are easy enough to put together as automated URLs:
import requests
from datetime import datetime, timedelta

base = 'http://www.tocom.or.jp/data/tick/TOCOMprice_'
day = datetime.now() - timedelta(days=1)  # start from yesterday
for i in range(5):
    # Step back over weekends first, in case we landed on one
    while day.weekday() >= 5:  # Sat = 5, Sun = 6
        day -= timedelta(days=1)
    r = requests.get(base + day.strftime('%Y%m%d') + '.csv')
    # Save r.content somewhere
    day -= timedelta(days=1)  # walk backwards; files for future dates don't exist yet
I used requests because its API is easier to use, but you can use urllib2 for this task too if you prefer.
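For anyone who would rather stay in the standard library: in Python 3, urllib2's functionality lives in `urllib.request`. The following is only a sketch under a few assumptions: Python 3, the URL pattern shown earlier, and that a missing file (a holiday, say) comes back as an HTTP error. The names `recent_weekdays` and `download` are my own, not from any library.

```python
import os
from datetime import date, timedelta
from urllib.error import HTTPError
from urllib.request import urlopen

BASE = 'http://www.tocom.or.jp/data/tick/TOCOMprice_'

def recent_weekdays(start, count):
    """Yield `count` weekdays, walking backwards from `start` (inclusive)."""
    day = start
    while count > 0:
        if day.weekday() < 5:  # Mon = 0 ... Fri = 4
            yield day
            count -= 1
        day -= timedelta(days=1)

def download(day, folder='.'):
    """Fetch one day's CSV and save it; return the path, or None on an HTTP error."""
    stamp = day.strftime('%Y%m%d')
    try:
        with urlopen(BASE + stamp + '.csv') as response:
            data = response.read()
    except HTTPError:
        return None  # assumed to mean no file was published for that day
    path = os.path.join(folder, 'TOCOMprice_' + stamp + '.csv')
    with open(path, 'wb') as f:
        f.write(data)
    return path
```

You would then loop with something like `for day in recent_weekdays(date.today() - timedelta(days=1), 5): download(day)`. Keeping the date logic in its own generator means it can be checked without touching the network.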