Trying to download an image from an image URL, but getting HTML instead

This is similar to "Try to scrape image from image url (using python urllib) but get html instead", but that solution did not work for me.

import requests

img_url = 'http://7-themes.com/data_images/out/79/7041933-beautiful-backgrounds-wallpaper.jpg'

# First request: do not follow the redirect, only read where it points
r = requests.get(img_url, allow_redirects=False)

# Use the redirect target as the Referer for the second request
headers = {}
headers['Referer'] = r.headers['location']

# Second request: fetch the image with the Referer header set
r = requests.get(img_url, headers=headers)
with open('7041933-beautiful-backgrounds-wallpaper.jpg', 'wb') as fh:
    fh.write(r.content)

The downloaded file is still an HTML page, not an image.


2016-09-27 alec.tu

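This site has a redirect mechanism: if you request the resource directly, it redirects you to an HTML page. So when your code requests the image, the server redirects the request to that HTML page, and what you receive is the HTML file rather than the image.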
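One way to confirm the redirect described in this comment is to request the URL without following redirects and look at the status code and the Location header. The sketch below is only an illustration: `redirect_target` and `probe` are hypothetical helper names, and the set of status codes is the usual list of HTTP redirect codes, not something the site documents.

```python
# Status codes that normally carry a Location header pointing elsewhere
REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(status_code, headers):
    """Return the Location value if the response is a redirect, else None."""
    if status_code not in REDIRECT_CODES:
        return None
    # Header names are case-insensitive, so compare them in lower case
    for name, value in headers.items():
        if name.lower() == "location":
            return value
    return None


def probe(url, timeout=10):
    """Request url without following redirects and report where it points."""
    # Imported here so redirect_target can be used without requests installed
    import requests

    resp = requests.get(url, allow_redirects=False, timeout=timeout)
    return redirect_target(resp.status_code, resp.headers)
```

If `probe(img_url)` returns a URL ending in `.html`, the server is sending direct image requests to the hosting page, which matches the behavior described above.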

So there is no solution for this site?

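The usual solution is to replicate what the browser does. Launch Chrome, open Developer Tools, and switch to the Network tab. Then load the page that hosts the image. What usually happens is that the HTML page sets some cookies (or other HTTP artifacts), and these are sent along with the request for the image. So look at the browser's request for the image and check which headers and cookies were sent with it. Then look at the rest of the traffic to see where they came from. – GregHNZ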
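The advice above can be sketched with a `requests.Session`, which keeps any cookies set by the hosting page and sends them with the later image request. This is a rough sketch under assumptions: the function names are hypothetical, the `User-Agent` value is a placeholder, and whether this site actually needs cookies or a particular User-Agent has to be checked in the Network tab.

```python
def browser_like_headers(page_url, user_agent="Mozilla/5.0"):
    """Build headers similar to what a browser sends for an embedded image."""
    # Assumption: the site checks Referer; the User-Agent value is a placeholder
    return {"Referer": page_url, "User-Agent": user_agent}


def fetch_image_like_browser(page_url, img_url, timeout=10):
    """Load the hosting page first, then request the image in the same session."""
    # Imported here so browser_like_headers can be used without requests installed
    import requests

    with requests.Session() as session:
        # Visiting the page lets the session store any cookies the site sets
        session.get(page_url, timeout=timeout)
        # The image request reuses those cookies and adds the Referer header
        resp = session.get(
            img_url, headers=browser_like_headers(page_url), timeout=timeout
        )
        resp.raise_for_status()
        return resp
```

Here `page_url` would be the HTML page that embeds the image and `img_url` the image itself; the bytes to save are in `resp.content`.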

Your Referer header is not set correctly. I hard-coded the Referer and it works fine.

import requests

img_url = 'http://7-themes.com/data_images/out/79/7041933-beautiful-backgrounds-wallpaper.jpg'

r = requests.get(img_url, allow_redirects=False)

# Hard-code the Referer to the HTML page that hosts the image
headers = {}
headers['Referer'] = 'http://7-themes.com/7041933-beautiful-backgrounds-wallpaper.html'

# Request the image again with the Referer set, without following redirects
r = requests.get(img_url, headers=headers, allow_redirects=False)
with open('7041933-beautiful-backgrounds-wallpaper.jpg', 'wb') as fh:
    fh.write(r.content)


2016-09-27 05:09:48

Yes. I found that my root cause was the Referer field, but there is no need to make two HTTP requests.

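I found the root cause in my code: the Referer field in the header was still pointing to an HTML page, not to the image.

So I changed the Referer field to img_url, and this works.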

import requests

img_url = 'http://7-themes.com/data_images/out/79/7041933-beautiful-backgrounds-wallpaper.jpg'

# Set the Referer to the image URL itself; a single request is enough
headers = {}
headers['Referer'] = img_url

r = requests.get(img_url, headers=headers)

with open('7041933-beautiful-backgrounds-wallpaper.jpg', 'wb') as fh:
    fh.write(r.content)
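Since the original symptom was an HTML page silently saved under a .jpg name, it may be worth checking the Content-Type before writing the file. The following is a minimal sketch, not part of the answer above; `save_if_image` is a hypothetical helper, and it assumes the server labels images with an `image/*` media type.

```python
def save_if_image(content_type, body, path):
    """Write body to path only when content_type is image/*; return True if written."""
    # Drop parameters such as "; charset=utf-8" and normalize the case
    media_type = content_type.split(";")[0].strip().lower()
    if not media_type.startswith("image/"):
        return False
    with open(path, "wb") as fh:
        fh.write(body)
    return True
```

With the response from the code above, this could be called as `save_if_image(r.headers.get('Content-Type', ''), r.content, '7041933-beautiful-backgrounds-wallpaper.jpg')`; a return value of False suggests the server still answered with something other than an image.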


2016-09-27 05:10:42
