How to use Python

Code for logging in to the website, after which I want to trigger the download of a .txt file:

import requests 
from bs4 import BeautifulSoup as bs 
import urllib 

payload = {  
    'email' : '[email protected]', 
    'password' : 'xxx' 
} 

with requests.Session() as s: 
    m = s.get('https://www.free-ebooks.net',headers={'User-agent': 'Mozilla/5.0'}) 
    t = s.post('https://www.free-ebooks.net',data = payload) 
    r = s.get('https://www.free-ebooks.net/ebook/The-Best-Scandal-Ever') 
    print(r.content)

Judging from the printed r.content, I believe my login succeeded.
Code to trigger the download:

<<<code same as above>>> 
with requests.Session() as s: 
    m = s.get('https://www.free-ebooks.net',headers={'User-agent': 'Mozilla/5.0'}) 
    t = s.post('https://www.free-ebooks.net',data = payload) 
    r = s.get('https://www.free-ebooks.net/ebook/The-Best-Scandal-Ever') 
    urllib.urlretrieve("https://www.free-ebooks.net/ebook/The-Best-Scandal-Ever/txt", "myfile007.pdf")

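In my output PDF I get the page's source code instead of the actual file content.
I have a feeling I should be using the session I already started, but I don't know how to do that.
Anyone?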

2014-10-06 dreamer

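How did you confirm that the login succeeded? Can you see a session ID in 's.cookies'? Adding to @falsetru's answer: the actual URL that triggers the text download is '../The-Best-Scandal-Ever/txt?dl'. '../The-Best-Scandal-Ever/txt' just opens a web page that internally triggers the actual download. – srj 2014-10-07 21:03:52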
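One way to act on the cookie check suggested in this comment is to list the cookie names the session holds after the login POST. This is only a rough sketch: the helper names `cookie_names` and `has_cookie` are made up for illustration, and the cookie name `PHPSESSID` is a guess, since the thread does not say what the site calls its session cookie.

```python
def cookie_names(jar):
    """Return the sorted names of all cookies in a cookie jar.

    `jar` is expected to be something like `s.cookies` from a
    requests.Session; iterating it yields cookie objects with a `.name`.
    """
    return sorted(cookie.name for cookie in jar)


def has_cookie(jar, name):
    """Return True if a cookie called `name` is present in the jar."""
    return name in cookie_names(jar)


# Hypothetical session-cookie name; check the real one in your browser's
# developer tools, because the thread does not state it.
ASSUMED_SESSION_COOKIE = "PHPSESSID"
```

After `t = s.post(...)`, something like `print(cookie_names(s.cookies))` shows what the server set. If no session-like cookie appears, or the list is identical before and after the POST, the login probably did not go through, even if the returned page looks plausible.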

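requests and urllib are separate libraries. They do not share any state (in particular, cookies), so the urllib call is not logged in.

Use requests consistently.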

with requests.Session() as s:
    # Visit the site first so the session picks up its initial cookies.
    m = s.get('https://www.free-ebooks.net', headers={'User-agent': 'Mozilla/5.0'})
    # Log in; the session keeps whatever cookies the login sets.
    t = s.post('https://www.free-ebooks.net', data=payload)
    r = s.get('https://www.free-ebooks.net/ebook/The-Best-Scandal-Ever')
    # Fetch the file through the same session instead of urllib.
    resp = s.get("https://www.free-ebooks.net/ebook/The-Best-Scandal-Ever/txt",
                 stream=True)
    with open("myfile007.pdf", "wb") as f:
        f.writelines(resp.iter_content())

2014-10-06 16:05:46 falsetru

No, I tried the code above, and the page source is still being saved to "myfile007". – dreamer 2014-10-06 17:31:54
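If the saved file keeps turning out to be page source, it may help to check what the server actually returned before writing anything to disk. The following is a minimal sketch, not part of the original answer: `looks_like_html` and `save_download` are hypothetical helper names, the HTML detection is only a heuristic, and `session` is assumed to be an already logged-in `requests.Session`.

```python
def looks_like_html(content_type, first_bytes):
    """Heuristic: True if the response appears to be an HTML page."""
    if "text/html" in (content_type or "").lower():
        return True
    head = first_bytes.lstrip()[:15].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def save_download(session, url, path, chunk_size=8192):
    """Stream `url` to `path` through `session`; refuse to save HTML pages."""
    resp = session.get(url, stream=True)
    resp.raise_for_status()
    chunks = resp.iter_content(chunk_size=chunk_size)
    first = next(chunks, b"")  # peek at the first chunk without losing it
    if looks_like_html(resp.headers.get("Content-Type"), first):
        raise ValueError("got an HTML page instead of the file: %s" % url)
    with open(path, "wb") as f:
        f.write(first)
        for chunk in chunks:
            f.write(chunk)
    return path
```

Called inside the `with requests.Session() as s:` block after the login, e.g. `save_download(s, url, "myfile007.txt")`, this raises a `ValueError` when the URL only serves a web page. In that case it may be worth trying the `/txt?dl` variant of the URL mentioned in the comments; whether that URL works is not confirmed in this thread.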
