2014-09-04
1

Getting all YouTube videos of a channel (some videos are missing)

I am using Google's v3 API for YouTube:

$url = 'https://www.googleapis.com/youtube/v3/search?part=id&channelId=' . $channelID . '&maxResults=50&order=date&key=' . $API_key; 

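I have set up a script that should give me all videos for a given channel ID. For some channels I get every video; for others some are missing (compared with the number of videos shown directly on YouTube); and for larger channels I get at most 488 videos, even though there are more.

The pageToken behaves strangely. For example, one channel has 955 videos. I get 18 pages with 50 items each (which would be 900 videos). Some of the items are playlists, but even after subtracting the 23 playlists I still have 877 videos. Once I remove duplicates, only 488 results are left! Meanwhile, totalResults in the JSON output reports 975 results!?

Here is my recursive function: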

function fetchAllVideos($parsed_json){ 
    $foundIds = array(); 
    if($parsed_json != ''){ 
        $foundIds = getVideoIds($parsed_json); 
        $nextPageToken = getNextPageToken($parsed_json); 
        $prevPageToken = getPrevPageToken($parsed_json); 

        if($nextPageToken != ''){ 
            $new_parsed_json = getNextPage($nextPageToken); 
            $foundIds = array_merge($foundIds, fetchAllVideos($new_parsed_json)); 
        } 
        if($prevPageToken != ''){ 
            $new_parsed_json = getNextPage($prevPageToken); 
            $foundIds = array_merge($foundIds, fetchAllVideos($new_parsed_json)); 
        } 
    } 

    return $foundIds; 
} 

I call it like this:

$videoIds = fetchAllVideos($parsed_json); 

where $parsed_json is the result of the first URL I fetch. Can you see a mistake here?

Does anyone know how the video count shown directly on YouTube is calculated? Has anyone managed to get a complete list that matches the number shown on YouTube?
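One thing that stands out in the recursive function above is that it follows both nextPageToken and prevPageToken, so pages can be fetched more than once, which would explain at least some of the duplicates. Below is a minimal sketch of an iterative alternative that only moves forward and drops duplicate IDs. The names collectVideoIds, makeSearchFetcher and the $maxPages safety limit are hypothetical and not part of the original script; the sketch also assumes the response is decoded to an associative array and that video results carry their ID under id.videoId.

```php
<?php
// Hypothetical helper: $fetchPage receives a page token (null for the first
// page) and returns the decoded search response as an associative array.
function collectVideoIds(callable $fetchPage, $maxPages = 100) {
    $ids = array();
    $pageToken = null;

    for ($page = 0; $page < $maxPages; $page++) {
        $response = $fetchPage($pageToken);
        if (!is_array($response) || !isset($response['items'])) {
            break; // failed or malformed response: stop with what we have
        }
        foreach ($response['items'] as $item) {
            // search results may also contain playlists and channels
            if (isset($item['id']['videoId'])) {
                $ids[] = $item['id']['videoId'];
            }
        }
        if (empty($response['nextPageToken'])) {
            break; // no further page
        }
        // only ever move forward; prevPageToken is ignored on purpose
        $pageToken = $response['nextPageToken'];
    }

    // remove duplicates and reindex the list
    return array_values(array_unique($ids));
}

// Hypothetical fetcher for the search URL used in the question.
function makeSearchFetcher($channelID, $API_key) {
    return function ($pageToken) use ($channelID, $API_key) {
        $url = 'https://www.googleapis.com/youtube/v3/search?part=id&channelId='
             . urlencode($channelID) . '&maxResults=50&order=date&key=' . urlencode($API_key);
        if ($pageToken !== null) {
            $url .= '&pageToken=' . urlencode($pageToken);
        }
        $json = @file_get_contents($url);
        return $json === false ? null : json_decode($json, true);
    };
}
```

It could be called as collectVideoIds(makeSearchFetcher($channelID, $API_key)). Passing the fetcher in as a callable keeps the paging logic testable without network access. This only addresses the repeated pages; it does not by itself explain why totalResults differs from the number of videos shown on YouTube.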

Answers

2

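What worked for me was this URL:

https://gdata.youtube.com/feeds/api/users/USERNAME_HERE/uploads?max-results=50&alt=json&start-index=1

It is a JSON feed, and you have to loop over it until you get fewer than 50 results.

Edit:

Here is the script I used: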

ini_set('max_execution_time', 900); 

// Collect the IDs of all uploads of a user by paging through the uploads feed.
function getVideos($channel){ 
    $ids = array(); 
    $start_index = 1; // start-index in the feed URL begins at 1

    if($channel == ""){ 
        return false; 
    } 

    do { 
        $url = 'https://gdata.youtube.com/feeds/api/users/' . urlencode($channel) . '/uploads?max-results=50&alt=json&start-index=' . $start_index; 
        $json = @file_get_contents($url); 
        if($json === false){ 
            break; // request failed: return what has been collected so far
        } 
        $obj = json_decode($json); 
        // guard against a page that contains no entries
        $entries = isset($obj->feed->entry) ? $obj->feed->entry : array(); 

        foreach($entries as $video){ 
            // the video ID is the last path segment of the entry's id URL
            $video_url = $video->id->{'$t'}; 
            $ids[] = substr($video_url, strrpos($video_url, '/') + 1); 
        } 
        $start_index += count($entries); 
    } while(count($entries) == 50); // fewer than 50 results means this was the last page

    return $ids; 
} 

$videoIds = getVideos('youtube'); 
echo '<pre>'; 
print_r($videoIds); 
echo '</pre>'; 

I have now run a test, and I still did not collect 100% of the videos. Nevertheless, this is the best option I have come up with.

I gave you an upvote, but you should post the final script anyway. You never know when it will be useful to someone. – Random 2015-02-24 20:56:31

@Random: I have now added the script I used. – testing 2015-02-24 22:10:43

1

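This script selects 60 days at a time, retrieves the results, and appends them to the existing data array. Done this way, there is no limit on how many videos can be retrieved, although it may take a while to trawl through a larger YouTube channel with several thousand videos. Make sure you set the API_KEY, the time zone, the username, the start date (which should be before the first video on the channel), and the period. The period defaults to 60 * 60 * 24 * 60, which is 60 days in seconds (5184000 seconds); it needs to be lowered if the channel publishes more than about 50 videos within 60 days.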

*All of this is commented in the script.
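The windowing idea described above can be sketched in isolation: split the time range into consecutive periods and format each bound as an RFC 3339 timestamp suitable for publishedAfter and publishedBefore. The function name buildPeriods and the array keys are hypothetical, and unlike the script this sketch formats the bounds in UTC with gmdate and clamps the last period to the end of the range.

```php
<?php
// Hypothetical helper: split [$startUtc, $endUtc) into consecutive periods of
// $period seconds and return RFC 3339 bounds for each of them.
function buildPeriods($startUtc, $endUtc, $period = 5184000) {
    if ($period <= 0) {
        throw new InvalidArgumentException('period must be positive');
    }
    $periods = array();
    for ($t = $startUtc; $t < $endUtc; $t += $period) {
        $periods[] = array(
            // gmdate() formats in UTC, so a literal "Z" suffix is correct
            'publishedAfter'  => gmdate('Y-m-d\TH:i:s\Z', $t),
            'publishedBefore' => gmdate('Y-m-d\TH:i:s\Z', min($t + $period, $endUtc)),
        );
    }
    return $periods;
}
```

For a busy channel, a smaller $period (for example 60 * 60 * 24 * 7 for one week) keeps each period below the 50 results that a single request returns.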

date_default_timezone_set("TIMEZONE"); 

//youtube api key 
$API_KEY = "YOUR API KEY"; 

function search($searchTerm,$url){ 
    $url = $url . urlencode($searchTerm); 

    $result = file_get_contents($url); 

    if($result !== false){ 
     return json_decode($result, true); 
    } 

    return false; 
} 

function get_user_channel_id($user){ 
    global $API_KEY; 
    $url = 'https://www.googleapis.com/youtube/v3/channels?key=' . $API_KEY . '&part=id&forUsername='; 
    return search($user,$url)['items'][0]['id']; 
} 

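function push_data($searchResults){ 
    global $data; 
    //search() returns false when the request fails; skip that period 
    if($searchResults === false || !isset($searchResults['items'])){ 
        return $data; 
    } 
    foreach($searchResults['items'] as $item){ 
        $data[] = $item; 
    } 
    return $data; 
} 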

function get_url_for_utc_period($channelId, $utc){ 
    //get the API_KEY 
    global $API_KEY; 
    //youtube specifies the DateTime to be formatted as RFC 3339 formatted date-time value (1970-01-01T00:00:00Z) 
    $publishedAfter = date("Y-m-d\TH:i:sP",strval($utc)); 
    //within a 60 day period 
    $publishedBefore_ = $utc + (60 * 60 * 24 * 60); 
    $publishedBefore = date("Y-m-d\TH:i:sP",$publishedBefore_); 
    //develop the URL with the API_KEY, channelId, and the time period specified by publishedBefore & publishedAfter 
    $url = 'https://www.googleapis.com/youtube/v3/search?part=snippet&type=video&key=' . $API_KEY . '&maxResults=50&channelId=' . $channelId . '&publishedAfter=' . urlencode($publishedAfter) . '&publishedBefore=' . urlencode($publishedBefore); 

    return array("url"=>$url,"utc"=>$publishedBefore_); 
} 
//the date that the loop will begin with, have this just before the first videos on the channel. 
//this is just an example date 
$start_date = "2013-1-1"; 
$utc = strtotime($start_date); 
$username = "CHANNEL USERNAME NOT CHANNEL ID"; 
//get the channel id for the username 
$channelId = get_user_channel_id($username); 

//start with an empty array so count($data) also works when nothing is found 
$data = array(); 
while($utc < time()){ 
    $url_utc = get_url_for_utc_period($channelId, $utc); 
    $searchResults = search("", $url_utc['url']); 
    $data = push_data($searchResults); 
    $utc += 60 * 60 * 24 * 60; 
} 
print "<pre>"; 
print_r($data); 
print "</pre>"; 

//check that all of the videos have been accounted for (cross-reference this with what it says on their youtube channel) 
print count($data);
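When cross-referencing the count, it may help to count unique video IDs rather than raw items, since consecutive periods share a boundary timestamp and, depending on how the API treats those bounds, a video published exactly on a boundary might show up twice. A minimal sketch, assuming $data holds search result items decoded as associative arrays with the ID under id.videoId; the function name uniqueVideoIds is hypothetical.

```php
<?php
// Hypothetical helper: reduce collected search items to a list of unique video IDs.
function uniqueVideoIds(array $data) {
    $ids = array();
    foreach ($data as $item) {
        // skip anything that is not a video result
        if (isset($item['id']['videoId'])) {
            $ids[] = $item['id']['videoId'];
        }
    }
    // drop duplicates and reindex
    return array_values(array_unique($ids));
}
```

With the script above, print count(uniqueVideoIds($data)); would then give the number to compare against the channel page.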