2017-07-04 101 views
1

Multipart upload to Amazon Glacier: Content-Range incompatible with Content-Length

I'm trying to upload a file of about 1 GB to Amazon Glacier. Somewhat arbitrarily, I decided to break it into 32 MiB parts and upload them serially.

import math
import boto3
from botocore.utils import calculate_tree_hash

client = boto3.client('glacier')
vault_name = 'my-vault'
size = 1073745600  # in bytes
size_mb = size / (2**20)  # convert to megabytes for readability
local_file = 'filename'

multi_up = client.initiate_multipart_upload(vaultName=vault_name,
                                            archiveDescription=local_file,
                                            partSize=str(2**25))  # 32 MiB in bytes
parts = math.floor(size_mb / 32)
with open("/Users/alexchase/Desktop/{}".format(local_file), 'rb') as upload:
    for p in range(parts):
        # Calculate lower and upper bounds for the byte ranges. The last range
        # is bigger than the ones that come before.
        lower = (p * (2**25))
        upper = (((p + 1) * (2**25)) - 1) if (p + 1 < parts) else (size)
        up_part = client.upload_multipart_part(vaultName=vault_name,
                                               uploadId=multi_up['uploadId'],
                                               range='bytes {}-{}/*'.format(lower, upper),
                                               body=upload)
    checksum = calculate_tree_hash(upload)
complete_up = client.complete_multipart_upload(archiveSize=str(size),
                                               checksum=checksum,
                                               uploadId=multi_up['uploadId'],
                                               vaultName=vault_name)

This produces an error about the very first byte range:

--------------------------------------------------------------------------- 
InvalidParameterValueException   Traceback (most recent call last) 
<ipython-input-2-9dd3ac986601> in <module>() 
    93       uploadId=multi_up['uploadId'], 
    94       range='bytes {}-{}/*'.format(lower, upper), 
---> 95       body=upload) 
    96      upload_info.append(up_part) 
    97     checksum = calculate_tree_hash(upload) 

~/anaconda/lib/python3.5/site-packages/botocore/client.py in _api_call(self, *args, **kwargs) 
    251      "%s() only accepts keyword arguments." % py_operation_name) 
    252    # The "self" in this scope is referring to the BaseClient. 
--> 253    return self._make_api_call(operation_name, kwargs) 
    254 
    255   _api_call.__name__ = str(py_operation_name) 

~/anaconda/lib/python3.5/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params) 
    555    error_code = parsed_response.get("Error", {}).get("Code") 
    556    error_class = self.exceptions.from_code(error_code) 
--> 557    raise error_class(parsed_response, operation_name) 
    558   else: 
    559    return parsed_response 

InvalidParameterValueException: An error occurred (InvalidParameterValueException) when calling the UploadMultipartPart operation: 
Content-Range: bytes 0-33554431/* is incompatible with Content-Length: 1073745600 

Can anyone see what I'm doing wrong?

Answers

0
Content-Range: bytes 0-33554431/* is incompatible with Content-Length: 1073745600 

You're telling the API that you're sending the first 32 MiB, but you're actually sending (proposing to send) the entire file, because with body=upload, upload isn't just the first part; it's the whole file object. Content-Length refers to the size of this part's upload, which should be 33554432 (32 MiB).

The docs are admittedly ambiguous...

body (bytes or seekable file-like object) -- The data to upload.

...but "the data to upload" appears to refer only to the data for this one part, despite the word "seekable".
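In other words, body must be exactly the bytes of one part, so that Content-Length matches the Content-Range header. A minimal sketch of that invariant, using an in-memory buffer as a stand-in for the real file (no Glacier call involved):

```python
import io

part_size = 2**25  # 32 MiB

# Stand-in for the real archive file: an in-memory buffer of dummy bytes.
f = io.BytesIO(b'\x00' * (part_size + 100))

lower = 0                        # first byte of this part
f.seek(lower)
part_bytes = f.read(part_size)   # read only this part, not the whole file

# Content-Length will be len(part_bytes); the matching Content-Range is:
content_range = 'bytes {}-{}/*'.format(lower, lower + len(part_bytes) - 1)
print(content_range)  # bytes 0-33554431/*
```

Passing part_bytes (rather than f) as body keeps the two headers consistent.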

1

@Michael-sqlbot is quite right that the Content-Range problem was that I was passing the whole file instead of a single part. I fixed that by using the read() method, but then I discovered a separate problem: according to the docs, the final part must be the same size as or smaller than the preceding parts. This means using math.ceil() instead of math.floor() to determine the number of parts.
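The part-count arithmetic described above can be checked in isolation; a small sketch using the question's sizes (nothing Glacier-specific here):

```python
import math

size = 1073745600   # archive size in bytes, from the question
part_size = 2**25   # 32 MiB

# ceil, not floor: the smaller final remainder still needs its own part
parts = math.ceil(size / part_size)

ranges = []
for p in range(parts):
    lower = p * part_size
    upper = min((p + 1) * part_size, size) - 1   # clamp the last part
    ranges.append((lower, upper))

print(parts)        # 33
print(ranges[-1])   # (1073741824, 1073745599) -- the short final part
```

With floor the loop would have produced 32 parts and silently dropped the final 3776 bytes; with ceil every byte of the archive falls in exactly one range.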

The working code is:

import math
import boto3
from botocore.utils import calculate_tree_hash

client = boto3.client('glacier')
vault_name = 'my-vault'
size = 1073745600  # in bytes
size_mb = size / (2**20)  # convert to megabytes for readability
local_file = 'filename'
partSize = 2**25  # 32 MiB in bytes

multi_up = client.initiate_multipart_upload(vaultName=vault_name,
                                            archiveDescription=local_file,
                                            partSize=str(partSize))
parts = math.ceil(size_mb / 32)  # the number of <=32 MiB parts we need
with open("/Users/alexchase/Desktop/{}".format(local_file), 'rb') as upload:
    for p in range(parts):
        # Calculate lower and upper bounds for the byte ranges. The last range
        # is now smaller than the ones that come before.
        lower = p * partSize
        upper = ((p + 1) * partSize - 1) if (p + 1 < parts) else (size - 1)
        read_size = upper - lower + 1
        file_part = upload.read(read_size)
        up_part = client.upload_multipart_part(vaultName=vault_name,
                                               uploadId=multi_up['uploadId'],
                                               range='bytes {}-{}/*'.format(lower, upper),
                                               body=file_part)
    # calculate_tree_hash reads from the current position, so rewind first;
    # this must also happen before the file is closed by the with block.
    upload.seek(0)
    checksum = calculate_tree_hash(upload)
complete_up = client.complete_multipart_upload(archiveSize=str(size),
                                               checksum=checksum,
                                               uploadId=multi_up['uploadId'],
                                               vaultName=vault_name)