Uploading files to S3 in parallel using s3cmd

I have a whole bunch of files on a server that I want to upload to S3. The files are stored with a .data extension, but they're really just a bunch of JPEGs, PNGs, ZIPs or PDFs.

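I've already written a short script that finds the MIME type and uploads the files to S3. It works, but it's slow. Is there any way to make the script below run using GNU parallel?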

#!/bin/bash 

for n in $(find -name "*.data") 
do 
     data=".data" 
     extension=`file $n | cut -d ' ' -f2 | awk '{print tolower($0)}'` 
     mimetype=`file --mime-type $n | cut -d ' ' -f2` 
     fullpath=`readlink -f $n` 

     changed="${fullpath/.data/.$extension}" 

     filePathWithExtensionChanged=${changed#*internal_data} 

     s3upload="s3cmd put -m $mimetype --acl-public $fullpath s3://tff-xenforo-data"$filePathWithExtensionChanged  

     response=`$s3upload` 
     echo $response 

done

Also, I'm sure this code could be greatly improved in general :) Feedback and tips would be much appreciated.

2014-11-14 Alan Hollis

Parallel uploads can be done with Python and boto – helloV 2014-11-14 19:07:53

Sure, I could write something in Go or another language, but I was trying to do this "all in bash" for no particular reason. – 2014-11-14 19:09:35

[Possible solution](http://blog.aclarke.eu/moving-copying-lots-of-s3-files-quickly-using-gnu-parallel/) – helloV 2014-11-14 19:13:18

You are clearly skilled at writing shell, and you are very close to a solution:

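s3upload_single() {
    local n=$1
    local extension mimetype fullpath changed filePathWithExtensionChanged

    # file -b prints only the description (e.g. "JPEG image data, ...");
    # keep its first word, lowercased, as the extension
    extension=$(file -b "$n" | cut -d ' ' -f1 | tr '[:upper:]' '[:lower:]')
    mimetype=$(file -b --mime-type "$n")
    fullpath=$(readlink -f "$n")

    # Swap the trailing .data for the detected extension
    changed="${fullpath%.data}.$extension"

    # Keep only the part of the path after "internal_data"
    filePathWithExtensionChanged=${changed#*internal_data}

    # Quoting keeps file names containing spaces intact
    s3cmd put -m "$mimetype" --acl-public "$fullpath" "s3://tff-xenforo-data$filePathWithExtensionChanged"
}
# Export the function so the shells started by parallel can see it
export -f s3upload_single
find . -name "*.data" | parallel s3upload_single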
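
The path rewriting in the function above comes down to two bash parameter expansions. As a minimal sketch, the hypothetical helper s3_key_for (not part of the original answer) isolates them so they can be tried without touching S3. The example path is made up, and the sketch assumes .data only needs replacing at the end of the file name:

```shell
#!/bin/bash
# Hypothetical helper: derive the S3 key from a local path and a detected extension.
s3_key_for() {
    local fullpath=$1 extension=$2
    # ${var%.data} removes a trailing ".data"; the real extension is appended
    local changed="${fullpath%.data}.$extension"
    # ${var#*internal_data} drops everything up to and including "internal_data"
    printf '%s\n' "${changed#*internal_data}"
}

# Made-up example path
s3_key_for /var/www/forum/internal_data/attachments/0/12-abc.data jpeg
# prints /attachments/0/12-abc.jpeg
```

The printed key is what gets appended to s3://tff-xenforo-data in the upload command.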

2014-11-14 22:25:51

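Awesome, thank you! If I'm reading this right, will it go through all the files returned by 'find -name "*.data"' and run every single one in parallel? If so, that's really cool, but I assume it becomes a problem when 'find -name "*.data"' returns, say, uhh, 80k files – 2014-11-15 12:35:06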

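Almost: 'parallel' defaults to one process per CPU core. If you want as many as possible, use 'parallel -j0'. That still won't run 80k in parallel; it stops spawning more once there are no file handles or processes left. – 2014-11-15 20:21:38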
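
To make the job cap concrete, here is a rough sketch that also copes with odd file names. It uses xargs -P rather than GNU parallel so that it runs on a machine without parallel installed; with GNU parallel the corresponding options are -j N for the cap and -0 for null-separated input. The function upload_all, the default of 8 jobs and the print-only stand-in for s3upload_single are assumptions for illustration, not part of the answer:

```shell
#!/bin/bash
# Stand-in for the real upload function so the sketch is safe to run:
# it only prints what it would upload.
s3upload_single() {
    printf 'would upload: %s\n' "$1"
}
export -f s3upload_single

# upload_all DIR [JOBS]: run at most JOBS uploads at once (default 8, an arbitrary choice).
upload_all() {
    local dir=$1 jobs=${2:-8}
    # -print0 / -0 keep file names with spaces or newlines intact;
    # -P caps the number of concurrent bash processes; -n 1 passes one file per call.
    find "$dir" -name '*.data' -print0 |
        xargs -0 -P "$jobs" -n 1 bash -c 's3upload_single "$1"' _
}
```

Calling upload_all . 4 would then process every .data file under the current directory, four at a time. Swapping the stand-in for the real function turns the dry run into an actual upload.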

Thanks! :) I've learned a lot from this :) – 2014-11-15 20:22:47

You can use s3cmd-modified, which lets you put/get/sync with multiple workers in parallel:

$ git clone https://github.com/pcorliss/s3cmd-modification.git
$ cd s3cmd-modification
$ python setup.py install
$ s3cmd --parallel --workers=4 sync /source/path s3://target/path

2016-06-15 23:50:14

Use the AWS CLI. It supports uploading files in parallel, and both uploads and downloads are very fast.

http://docs.aws.amazon.com/cli/latest/reference/s3/
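
As a rough sketch of what that could look like: the wrapper sync_to_s3, its arguments and the default of 20 concurrent requests are illustrative assumptions, while 'aws s3 sync' and the 'max_concurrent_requests' setting come from the AWS CLI itself. Note that a plain sync does not rename .data files the way the script in the question does:

```shell
#!/bin/bash
# sync_to_s3 SRC DEST [CONCURRENCY]
# SRC, DEST and the default of 20 are placeholders for illustration.
sync_to_s3() {
    local src=$1 dest=$2 concurrency=${3:-20}
    # Raise the number of simultaneous S3 requests (the CLI default is 10)
    aws configure set default.s3.max_concurrent_requests "$concurrency" &&
    # Copy only new or changed files from SRC to DEST
    aws s3 sync "$src" "$dest"
}
```

A call such as sync_to_s3 /source/path s3://target/path would then upload the whole tree with the raised request limit.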

2017-08-22 22:01:21 Hitul
