Scraping images using Beautiful Soup

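I want to scrape images from an article using Beautiful Soup. It seems to work, but I can't open the images. Every time I try to open an image from my desktop, I get a file format error. Any insights?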

import time
import urllib2
from bs4 import BeautifulSoup

timestamp = time.asctime()

# Parse HTML of article, aka making soup 
soup = BeautifulSoup(urllib2.urlopen(url).read()) 

# Create a new file to write content to 
txt = open('%s.jpg' % timestamp, "wb") 

# Scrape article main img 
links = soup.find('figure').find_all('img', src=True) 
for link in links: 
    link = link["src"].split("src=")[-1] 
    download_img = urllib2.urlopen(link) 
    txt.write('\n' + "Image(s): " + download_img.read() + '\n' + '\n') 

txt.close()


2014-03-28 user3285763

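You are adding a newline and some text to the start of each image's data (and more newlines to the end), which corrupts it.

You are also writing every image to the same file, which corrupts them again.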
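To see why the extra text matters, here is a small sketch. It relies on the fact that a JPEG file begins with the two bytes FF D8; the helper name `looks_like_jpeg` and the stand-in payload are made up for illustration, not taken from the question.

```python
JPEG_SOI = b"\xff\xd8"  # JPEG start-of-image marker: the first two bytes of the file


def looks_like_jpeg(data):
    """Return True if the payload begins with the JPEG start-of-image marker."""
    return data[:2] == JPEG_SOI


# Stand-in bytes, not a real image (assumption for the demo)
payload = JPEG_SOI + b"...rest of the image..."

# What the question's write() call effectively puts in the file
wrapped = b"\nImage(s): " + payload + b"\n\n"

print(looks_like_jpeg(payload))  # True
print(looks_like_jpeg(wrapped))  # False
```

An image viewer does a similar check on the leading bytes, so a file that starts with a newline and "Image(s): " is likely to be rejected as an invalid format.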

Move the file-writing logic into the loop and don't add any extra data to the image, and it should work fine.

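# Scrape article main img
links = soup.find('figure').find_all('img', src=True)
for i, link in enumerate(links):
    # One file per image; time.asctime() only has one-second resolution,
    # so the index keeps names unique for images saved in the same second
    timestamp = time.asctime()
    txt = open('%s-%d.jpg' % (timestamp, i), "wb")
    link = link["src"].split("src=")[-1]
    download_img = urllib2.urlopen(link)
    # Write only the raw image bytes, with nothing added before or after
    txt.write(download_img.read())
    txt.close()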
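The code in this thread uses `urllib2`, which only exists in Python 2. For Python 3, the same idea might look roughly like the sketch below. The helper name `save_images`, its `fetch` parameter, and the `image-N.jpg` naming scheme are assumptions made for illustration, not part of the original answer.

```python
import os
import urllib.request


def _download(url):
    """Default downloader: return the raw response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_images(urls, out_dir, fetch=_download):
    """Write each image to its own file and return the paths written.

    `fetch` is a hypothetical hook so the download step can be swapped out.
    """
    paths = []
    for i, url in enumerate(urls):
        # One file per image, named by index so nothing is overwritten
        path = os.path.join(out_dir, "image-%d.jpg" % i)
        # Binary mode and an unmodified payload: no text before or after
        with open(path, "wb") as f:
            f.write(fetch(url))
        paths.append(path)
    return paths
```

With a parsed page in `soup`, it could be called as `save_images([img["src"] for img in soup.find("figure").find_all("img", src=True)], ".")`. Passing a different `fetch` makes it possible to try the file-writing part without touching the network.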


2014-03-28 00:42:51
