2014-02-06 80 views

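Python error when downloading images?

I'm new to Python and I'm trying to download images from Google automatically. I want to enter a keyword and have my program run on its own, downloading the images from Google and saving them to a folder on my computer. Here is my code: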

import json
import os
import time
import requests
from PIL import Image
from StringIO import StringIO
from requests.exceptions import ConnectionError


def go(query, path):

    BASE_URL = 'https://ajax.googleapis.com/ajax/services/search/images?'\
               'v=1.0&q=' + query + '&start=%d'

    BASE_PATH = os.path.join(path, query)

    if not os.path.exists(BASE_PATH):
        os.makedirs(BASE_PATH)

    start = 0  # Google's start query string parameter for pagination.
    while start < 60:  # Google will only return a max of 56 results.
        r = requests.get(BASE_URL % start)
        for image_info in json.loads(r.text)['responseData']['results']:
            url = image_info['unescapedUrl']
            try:
                image_r = requests.get(url)
            except ConnectionError, e:
                print 'could not download %s' % url
                continue

            # Remove file-system path characters from name.
            title = image_info['titleNoFormatting'].replace('/', '').replace('\\', '')

            file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w')
            try:
                Image.open(StringIO(image_r.content)).save(file, 'JPEG')
            except IOError, e:
                # Throw away some gifs
                print 'could not save %s' % url
                continue
            finally:
                file.close()

        print start
        start += 4  # 4 images per page.

        time.sleep(1.5)

Example use:

go('angry human faces', 'myDirectory')

But I keep getting an error that says:

    file = open(os.path.join(BASE_PATH, '%s.jpg') % title, 'w')
IOError: [Errno 22] invalid mode ('w') or filename: u'myDirectory\\landscape\\Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA..jpg'

What do I need to do to fix this? Please help! I would really appreciate it.

Answers

filename: u'... - Desktop Wallpapers | MIRIADNA..jpg'
                                     ^ This is a problem

Windows does not allow the pipe character (|) in file names.

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx

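The following characters are reserved:

  • < (less than)
  • > (greater than)
  • : (colon)
  • " (double quote)
  • / (forward slash)
  • \ (backslash)
  • | (vertical bar or pipe)
  • ? (question mark)
  • * (asterisk)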
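
A quick way to check whether a given title contains any of these characters is sketched below. The names `RESERVED` and `find_reserved` are made up for illustration and are not part of the question's code; the sample title is taken from the error message.

```python
# Reserved characters from the list above (hypothetical helper, for illustration).
RESERVED = set('<>:"/\\|?*')


def find_reserved(name):
    """Return the reserved characters found in name, in order of appearance."""
    return [ch for ch in name if ch in RESERVED]


# Title as it appears in the error message, without the '.jpg' suffix.
title = u'Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA.'
print(find_reserved(title))
```

For this title the only character reported should be the pipe. It comes from the image title returned by the search at run time, not from anything typed in the source code.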

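In your case, a reserved character appears in the title of the image you download, and you then use that title as the file name. You can easily strip these characters, for example: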

title = ''.join(c for c in title if c not in '<>:"/\\|?*')
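
The same filtering could also be wrapped in a small helper, which makes it easier to reuse or to substitute a placeholder instead of deleting characters. This is only a sketch: `sanitize_filename`, `RESERVED_CHARS` and the `replacement` parameter are names invented here, and the character set is limited to the reserved characters listed in this answer.

```python
import re

# Reserved characters from the list in this answer (assumed to be the full set we care about).
RESERVED_CHARS = re.compile(r'[<>:"/\\|?*]')


def sanitize_filename(title, replacement=''):
    """Return title with every reserved character replaced by replacement."""
    return RESERVED_CHARS.sub(replacement, title)


# Title as it appears in the error message, without the '.jpg' suffix.
title = u'Nature - Landscapes - Views - Desktop Wallpapers | MIRIADNA.'
safe_title = sanitize_filename(title)
print(safe_title)
```

In the question's code, this would be applied to `title` just before the `open(...)` call, in place of the two `.replace()` calls.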

But there is no pipe character? – user3105664

@user3105664 Yes, there is: '| miranda.jpg' – TankorSmash

But I didn't include any pipe character in the code itself. When I ran the program, it returned an error. – user3105664