How do I split an HTTP response into chunks?

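I am trying to write a multithreaded downloader in Python. Say I have a link to a 100MB video and I want to download it with 5 threads, each downloading 20MB at the same time. To do that, I have to split the initial response into 5 parts representing different parts of the file (0-20MB, 20-40MB, 40-60MB, 60-80MB, 80-100MB). I searched and found that the HTTP Range header might help. Here is my sample code: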

from urllib.request import urlopen, Request
url = 'some video url'  # placeholder: put the video URL here
header = {'Range': 'bytes=%d-%d' % (5000, 10000)}  # trying to capture all the bytes between the 5000th and 10000th byte
req = Request(url, headers=header)
res = urlopen(req)
r = res.read()

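But the above code reads the whole video instead of the bytes I wanted, so it clearly isn't working. Is there a way to read a specified range of bytes from any part of the video, rather than reading from the start? Please try to explain in simple words.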

Source

2016-09-22 Airbear

A multithreaded downloader may not make things any faster if the bottleneck is the bandwidth of the connection. – martineau

See the Wikipedia article [_Byte serving_](https://en.wikipedia.org/wiki/Byte_serving) on this topic. The Content-Range header of the response tells you which bytes were delivered. – martineau

Yes, I agree with you. I just want to try it out. – Airbear
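
As a rough illustration of the Content-Range check mentioned in the comments, the sketch below requests a byte range and confirms what actually came back. The helper names `parse_content_range` and `read_range` are made up for this example. It assumes the server answers a satisfiable range request with status 206 and a `Content-Range: bytes start-end/total` header; a plain 200 would mean the Range header was ignored and the whole file is being sent.

```python
import re
from urllib.request import urlopen, Request


def parse_content_range(value):
    """Parse a header value such as 'bytes 5000-10000/104857600'.

    Returns (start, end, total); total is None when the server sends '*'.
    Hypothetical helper, not part of urllib.
    """
    match = re.fullmatch(r'bytes (\d+)-(\d+)/(\d+|\*)', value.strip())
    if match is None:
        raise ValueError('unrecognised Content-Range: %r' % value)
    start, end, total = match.groups()
    return int(start), int(end), None if total == '*' else int(total)


def read_range(url, start, end):
    """Request bytes start..end (inclusive) and report what was served."""
    request = Request(url, headers={'Range': 'bytes=%d-%d' % (start, end)})
    with urlopen(request) as response:
        # 206 Partial Content means the range was honoured.
        if response.status != 206:
            raise RuntimeError('Range header ignored (status %d)' % response.status)
        served = parse_content_range(response.headers.get('Content-Range', ''))
        return served, response.read()
```

Calling `read_range(url, 5000, 10000)` against a server that supports ranges should return a body of 5001 bytes, since both ends of the range are inclusive.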

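But the above code is reading the whole video instead of the bytes I wanted, and it clearly isn't working.

The core problem is that the default request uses the HTTP GET method, which pulls down the entire file all at once.

This can be fixed by adding request.get_method = lambda : 'HEAD'. This uses the HTTP HEAD method, which returns only the headers, to fetch the Content-Length and to verify that range requests are supported.

Here is a working example of chunked requests. Just change the URL to the one you are interested in: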

from urllib.request import urlopen, Request 

url = 'http://www.jython.org' # This is an example. Use your own url here. 

n = 5 
request = Request(url) 
request.get_method = lambda : 'HEAD' 
r = urlopen(request) 

# Verify that the server supports Range requests 
assert r.headers.get('Accept-Ranges', '') == 'bytes', 'Range requests not supported' 

# Compute chunk size using a double negation for ceiling division 
total_size = int(r.headers.get('Content-Length')) 
chunk_size = -(-total_size // n) 

# Showing chunked downloads. This should be run in multiple threads. 
chunks = [] 
for i in range(n): 
    start = i * chunk_size 
    end = start + chunk_size - 1 # Bytes ranges are inclusive 
    headers = dict(Range = 'bytes=%d-%d' % (start, end)) 
    request = Request(url, headers=headers) 
    chunk = urlopen(request).read() 
    chunks.append(chunk)

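The individual requests in the for-loop can be done in parallel using threads or processes. This gives a nice speed-up when run in an environment with multiple physical connections to the Internet. But if you only have one physical connection, that is likely to be the bottleneck, so parallel requests won't help as much as expected.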
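
One possible way to run those requests in threads is sketched below with `concurrent.futures`. The function names (`byte_ranges`, `fetch_range`, `download_in_threads`) are invented for this illustration, and it assumes `total_size` has already been read from Content-Length as in the example above. Unlike the loop above, it clamps the last range to the file size, which is optional because servers already treat an overlong end as "up to the last byte".

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen, Request


def byte_ranges(total_size, n):
    """Split total_size bytes into at most n inclusive (start, end) ranges."""
    if total_size <= 0:
        return []
    chunk_size = -(-total_size // n)  # ceiling division
    ranges = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1  # inclusive end
        ranges.append((start, end))
    return ranges


def fetch_range(url, byte_range):
    """Download one inclusive byte range and return its bytes."""
    start, end = byte_range
    request = Request(url, headers={'Range': 'bytes=%d-%d' % (start, end)})
    with urlopen(request) as response:
        return response.read()


def download_in_threads(url, total_size, n=5):
    """Fetch all ranges concurrently and join them in file order."""
    ranges = byte_ranges(total_size, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() yields results in the order of `ranges`,
        # even if the requests finish out of order.
        chunks = pool.map(lambda r: fetch_range(url, r), ranges)
        return b''.join(chunks)
```

For the 100MB case in the question, `byte_ranges(100 * 1024 * 1024, 5)` yields five ranges of 20MB each, and `download_in_threads(url, total_size)` would return the reassembled file as a single bytes object.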

Source

2017-03-26 04:07:41

Off-topic: the comment on 'chunk_size = -(-total_size // n)' is not very helpful. **_Why_** is it being done? – martineau

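@martineau It is explained in the comment just above it. The double negation computes "ceiling division". If *total_size* is 1602 and *n* is 4, you want *chunk_size* to be 401 (ceiling division) rather than 400 (floor division), which would not cover the whole dataset (400 * 4 < 1602 <= 401 * 4). –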
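
The arithmetic in that comment can be checked with a few lines. This is only an illustration; `ceil_div` is a name chosen here, not something from the answer's code.

```python
import math


def ceil_div(a, b):
    """Ceiling division using only integer arithmetic."""
    # -a // b floors towards negative infinity; negating again rounds a / b up.
    return -(-a // b)


total_size, n = 1602, 4
print(total_size // n)          # floor division: 400
print(ceil_div(total_size, n))  # ceiling division: 401
print(ceil_div(total_size, n) == math.ceil(total_size / n))  # True
```

The integer form avoids the float conversion in `math.ceil(total_size / n)`, which can lose precision for very large sizes.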
