
Efficient way to pull files and metadata from Amazon S3?

Is there a more efficient way to list the files in an Amazon S3 bucket and fetch the metadata for each file? I'm using the AWS PHP SDK.

if ($paths = $s3->get_object_list('my-bucket')) {
    foreach ($paths as $path) {
        $meta = $s3->get_object_metadata('my-bucket', $path);
        echo $path . ' was modified on ' . $meta['LastModified'] . '<br />';
    }
}

At the moment I have to run get_object_list() to list all of the files, and then call get_object_metadata() for each file to fetch its metadata.

If my bucket contains 100 files, that makes 101 calls to retrieve this data. It would be great if it could be done in a single call.

For example:

if ($paths = $s3->get_object_list('my-bucket')) {
    foreach ($paths as $path) {
        echo $path['FileName'] . ' was modified on ' . $path['LastModified'] . '<br />';
    }
}

Using S3 objects to store 'files' is like using a whole 2 GB filesystem partition to store your Zork image. Put all of your metadata into a single object. And yes, 100 objects take 100 transactions. – starbolin

Answers

1

I ended up using the list_objects function and pulling out the last-modified metadata I needed.

All in one call :)
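
A minimal sketch of that approach, assuming the AWS SDK for PHP 1.x used in the question (the ListObjects response already carries a Key and LastModified value for every object, so no per-file metadata calls are needed):

require_once 'sdk.class.php'; // AWS SDK for PHP 1.x

$s3 = new AmazonS3();

// One ListObjects request returns up to 1000 objects, each with its
// Key, LastModified, ETag and Size.
$response = $s3->list_objects('my-bucket');

if ($response->isOK()) {
    foreach ($response->body->Contents as $object) {
        echo (string) $object->Key . ' was modified on ' . (string) $object->LastModified . '<br />';
    }
}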

2

I know this is a bit old, but I ran into this problem and, to solve it, I extended the AWS SDK to use its batch-request functionality for this type of problem. It retrieves custom metadata for a large number of files much faster. Here is my code:

/**
 * Name: Steves_Amazon_S3
 *
 * Extends the AmazonS3 class in order to create a function to
 * more efficiently retrieve a list of files and their custom
 * metadata using the CFBatchRequest function.
 */
class Steves_Amazon_S3 extends AmazonS3 {

    public function get_object_metadata_batch($bucket, $filenames, $opt = null) {
        $batch = new CFBatchRequest();

        foreach ($filenames as $filename) {
            // Queue a HEAD request for each object; the response headers
            // include standard and custom (x-amz-meta-*) metadata.
            $this->batch($batch)->get_object_headers($bucket, $filename);
        }

        // Send all queued HEAD requests in parallel.
        $response = $this->batch($batch)->send();

        // Fail if any requests were unsuccessful.
        if (!$response->areOK()) {
            return false;
        }

        $result = array();
        foreach ($response as $file) {
            $temp = array();
            $temp['name'] = (string) basename($file->header['_info']['url']);
            $temp['etag'] = (string) basename($file->header['etag']);
            $temp['size'] = $this->util->size_readable((integer) basename($file->header['content-length']));
            $temp['size_raw'] = basename($file->header['content-length']);
            $temp['last_modified'] = (string) date("jS M Y H:i:s", strtotime($file->header['last-modified']));
            $temp['last_modified_raw'] = strtotime($file->header['last-modified']);
            @$temp['creator_id'] = (string) $file->header['x-amz-meta-creator'];
            @$temp['client_view'] = (string) $file->header['x-amz-meta-client-view'];
            @$temp['user_view'] = (string) $file->header['x-amz-meta-user-view'];

            $result[] = $temp;
        }

        return $result;
    }
}
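
A hypothetical usage sketch (the bucket name is a placeholder, and the class assumes the AWS SDK for PHP 1.x is already loaded and configured):

$s3 = new Steves_Amazon_S3();

// get_object_list() returns an array of key names, which is exactly
// what get_object_metadata_batch() expects.
$keys  = $s3->get_object_list('my-bucket');
$files = $s3->get_object_metadata_batch('my-bucket', $keys);

if ($files !== false) {
    foreach ($files as $file) {
        echo $file['name'] . ' was modified on ' . $file['last_modified'] . '<br />';
    }
}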
2

You need to know that the list_objects function has a limit: it will not return more than 1000 objects, even if the max-keys option is set to a larger number.

To work around this, you have to load the data in several passes:

private function _getBucketObjects($prefix = '', $booOneLevelOny = false)
{
    $objects = array();
    $lastKey = null;
    do {
        $args = array();

        // Resume the listing after the last key from the previous page.
        if (isset($lastKey)) {
            $args['marker'] = $lastKey;
        }

        if (strlen($prefix)) {
            $args['prefix'] = $prefix;
        }

        // A delimiter of '/' limits the listing to one "folder" level.
        if ($booOneLevelOny) {
            $args['delimiter'] = '/';
        }

        $res = $this->_client->list_objects($this->_bucket, $args);
        if (!$res->isOK()) {
            return null;
        }

        foreach ($res->body->Contents as $object) {
            $objects[] = $object;
            $lastKey = (string) $object->Key;
        }

        // S3 sets IsTruncated to 'true' while more pages remain.
        $isTruncated = (string) $res->body->IsTruncated;
        unset($res);
    } while ($isTruncated == 'true');

    return $objects;
}

As a result, you have the complete list of objects.
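
A minimal usage sketch, assuming it is called from inside the same class that owns the _client and _bucket members above (the prefix is a placeholder, and the field names come from the standard ListObjects response):

// _getBucketObjects() returns the SimpleXML <Contents> nodes collected above.
$objects = $this->_getBucketObjects('uploads/');

foreach ($objects as $object) {
    // Each node carries Key, LastModified, ETag and Size from the listing.
    echo (string) $object->Key . ' was modified on ' . (string) $object->LastModified . '<br />';
}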


What if you also have some custom headers? They will not be returned by the list_objects function. In that case, this helps:

$arrHeaders = array();
foreach (array_chunk($arrObjects, 1000) as $object_set) {
    $batch = new CFBatchRequest();
    foreach ($object_set as $object) {
        // Skip "folder" placeholder keys; only real objects get a HEAD request.
        if (!$this->isFolder((string) $object->Key)) {
            $this->_client->batch($batch)->get_object_headers($this->_bucket, $this->preparePath((string) $object->Key));
        }
    }

    // Send up to 1000 HEAD requests per batch.
    $response = $this->_client->batch($batch)->send();

    if ($response->areOK()) {
        foreach ($response as $arrHeaderInfo) {
            // Headers include the custom x-amz-meta-* values that list_objects omits.
            $arrHeaders[] = $arrHeaderInfo->header;
        }
    }
    unset($batch, $response);
}
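
The snippet assumes two helper methods, isFolder() and preparePath(), that are not shown in the answer. A plausible sketch of what they might look like (these are guesses, not the author's implementations):

// Hypothetical helpers assumed by the batch snippet above.
private function isFolder($key)
{
    // S3 "folders" are usually zero-byte placeholder keys ending in '/'.
    return substr($key, -1) === '/';
}

private function preparePath($key)
{
    // Pass the key through unchanged; adjust here if keys need a prefix
    // stripped or other normalization before the HEAD request.
    return $key;
}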

This is exactly what I needed. Thank you so much! – CoreDumpError


@CoreDumpError You're welcome :) I like this site because the questions are interesting and the answers can be quite different and useful! – Andron