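Python 3 URLError: unknown url type: http

I want to use urllib.request.urlretrieve together with the multiprocessing module to download some files and do some processing on them. However, every time I try to run my program, it gives me this error: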
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "./thumb.py", line 13, in download_and_convert
    filename, headers = urlretrieve(url)
  File "/usr/lib/python3.4/urllib/request.py", line 186, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.4/urllib/request.py", line 463, in open
    response = self._open(req, data)
  File "/usr/lib/python3.4/urllib/request.py", line 486, in _open
    'unknown_open', req)
  File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.4/urllib/request.py", line 1252, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: http>
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./thumb.py", line 27, in <module>
    pool.map(download_and_convert, enumerate(csvr))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
    raise self._value
urllib.error.URLError: <urlopen error unknown url type: http>
It seems to be choking on this URL: http://phytoimages.siu.edu/users/vitt/10_27_06_2/Equisetumarvense.JPG
Here is my code:
#!/usr/bin/env python3
from subprocess import Popen
from sys import argv
import csv
from multiprocessing import Pool
from urllib.request import urlretrieve


def download_and_convert(args):
    # args is a (row number, CSV row) pair; the URL is the row's first column.
    num, url_list = args
    url = url_list[0]
    try:
        filename, headers = urlretrieve(url)
    except Exception:
        # Show which URL failed before re-raising.
        print(url)
        raise
    # Resize with ImageMagick's convert, keeping the original file extension.
    Popen(["convert", filename, "-resize", "250x250",
           str(num) + '.' + url.split('.')[-1]])


if __name__ == "__main__":
    csvr = csv.reader(open(argv[1]))
    # The optional second argument is the worker count; Pool needs an int.
    nprocs = int(argv[2]) if len(argv) > 2 else None
    pool = Pool(processes=nprocs)
    pool.map(download_and_convert, enumerate(csvr))
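I don't know why this error occurs. Is it because I'm using multiprocessing? If anyone could help me, I would really appreciate it.
Edit: This is the first URL it tries to process, and changing it doesn't change the error.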
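
One way to check whether multiprocessing is involved at all is to call the download function in the main process, without a Pool, on the first few rows. The sketch below is only an illustration: try_serial and fake_fetch are hypothetical names that are not part of the original script, and fake_fetch stands in for urlretrieve so the example runs without network access.

```python
from urllib.error import URLError
from urllib.request import urlretrieve


def try_serial(rows, fetch=urlretrieve, limit=3):
    """Call fetch on the first few URLs in this process, without a Pool."""
    outcomes = []
    for num, row in enumerate(rows):
        if num >= limit:
            break
        url = row[0]
        try:
            fetch(url)
            outcomes.append((num, "ok"))
        except (URLError, ValueError) as exc:
            # repr() makes invisible characters in the URL visible.
            outcomes.append((num, "{}: {!r}".format(exc, url)))
    return outcomes


def fake_fetch(url):
    """Hypothetical stand-in for urlretrieve; rejects anything not starting with 'http'."""
    if not url.startswith("http"):
        raise URLError("unknown url type: http")


# Made-up rows; the second one starts with an invisible character (an assumption).
rows = [["http://example.com/a.jpg"], ["\ufeffhttp://example.com/b.jpg"]]
for outcome in try_serial(rows, fetch=fake_fetch):
    print(outcome)
```

If the same URLError shows up with the real urlretrieve in this serial run, the Pool is probably not the cause and the URL string itself deserves a closer look.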
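You say, "the URL it seems to be choking on is...". Can you confirm that? I see a print(url) in your except block, but I don't see that output in your question. If you isolate the line containing that URL in a separate file, can you reproduce the error? Can you post that line in your question? – larsks
Yes, that is the line printed from the except block. – Aereaux
I should also mention that this is the first URL in the file I'm processing, and if I remove it, I get the same error with the next URL. – Aereaux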
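
Following up on the print(url) discussion above: print() does not show invisible characters, so printing repr(url) may be more informative. One possible cause, which is an assumption and not something confirmed in this thread, is a UTF-8 byte order mark (BOM) at the start of the CSV file, which would end up in front of "http" in whichever row comes first. The sketch below uses made-up sample bytes and hypothetical helper names (first_cells, looks_like_http) to show how the first cell differs when the same bytes are decoded as "utf-8" versus "utf-8-sig".

```python
import csv
import io


def first_cells(raw_bytes, encoding):
    """Decode CSV bytes and return the first cell of every non-empty row."""
    text = raw_bytes.decode(encoding)
    return [row[0] for row in csv.reader(io.StringIO(text)) if row]


def looks_like_http(cell):
    """True when the cell starts with a plain http or https scheme."""
    return cell.startswith(("http://", "https://"))


# Hypothetical file contents: a UTF-8 BOM followed by one URL per line.
sample = b"\xef\xbb\xbfhttp://example.com/a.jpg\nhttp://example.com/b.jpg\n"

for encoding in ("utf-8", "utf-8-sig"):
    for cell in first_cells(sample, encoding):
        # repr() exposes a leading BOM that print(cell) would hide.
        print(encoding, repr(cell), looks_like_http(cell))
```

If repr() does reveal such a character in the real file, opening it with open(argv[1], encoding="utf-8-sig") would be one option to try; if it does not, this guess can be ruled out.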