2014-09-18 68 views
3

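YouTube API v3 page tokens

I am using the search API and paging through the results with nextPageToken, but I cannot retrieve all of the results this way. I only get 500 results out of roughly 455,000.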

Here is the Java code that fetches the search results:

youtube = new YouTube.Builder(Auth.HTTP_TRANSPORT, Auth.JSON_FACTORY, new HttpRequestInitializer() {
    public void initialize(HttpRequest request) throws IOException {}
}).setApplicationName("youtube-search").build();

YouTube.Search.List search = youtube.search().list("id,snippet"); 
String apiKey = properties.getProperty("youtube.apikey"); 
search.setKey(apiKey); 
search.setType("video"); 
search.setMaxResults(50); 
search.setQ(queryTerm); 
boolean allResultsRead = false; 
while (!allResultsRead) {
    SearchListResponse searchResponse = search.execute();
    System.out.println("Printed " + searchResponse.getPageInfo().getResultsPerPage() + " out of " + searchResponse.getPageInfo().getTotalResults() + ". Current page token: " + search.getPageToken() + "Next page token: " + searchResponse.getNextPageToken() + ". Prev page token" + searchResponse.getPrevPageToken());
    if (searchResponse.getNextPageToken() == null) {
        allResultsRead = true;
        search = youtube.search().list("id,snippet");
        search.setKey(apiKey);
        search.setType("video");
        search.setMaxResults(50);
    } else {
        search.setPageToken(searchResponse.getNextPageToken());
    }
}

The output of the while loop is:

Printed 50 out of 455085. Current page token: null Next page token: CDIQAA. Prev page token null 
Printed 50 out of 454983. Current page token: CDIQAA Next page token: CGQQAA. Prev page token CDIQAQ 
Printed 50 out of 455081. Current page token: CGQQAA Next page token: CJYBEAA. Prev page token CGQQAQ 
Printed 50 out of 454981. Current page token: CJYBEAA Next page token: CMgBEAA. Prev page token CJYBEAE 
Printed 50 out of 455081. Current page token: CMgBEAA Next page token: CPoBEAA. Prev page token CMgBEAE 
Printed 50 out of 454981. Current page token: CPoBEAA Next page token: CKwCEAA. Prev page token CPoBEAE 
Printed 50 out of 455081. Current page token: CKwCEAA Next page token: CN4CEAA. Prev page token CKwCEAE 
Printed 50 out of 454980. Current page token: CN4CEAA Next page token: CJADEAA. Prev page token CN4CEAE 
Printed 50 out of 455081. Current page token: CJADEAA Next page token: CMIDEAA. Prev page token CJADEAE 
Printed 50 out of 455081. Current page token: CMIDEAA Next page token: null. Prev page token CMIDEAE 

After 10 iterations the loop exits because the next page token is null.

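I am new to the YouTube API and am not sure what I am doing wrong here. I have two questions: 1. How do I get all of the results? 2. Why is the previous page token of page 3 not the same as the current token of page 2?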

Any help would be greatly appreciated. Thanks!

Answers

17

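What you are experiencing is expected; with nextPageToken you can get at most 500 results. If you are interested in how this came about, you can read through this thread: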

https://code.google.com/p/gdata-issues/issues/detail?id=4282

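As a summary of that thread, it basically boils down to the fact that, with so much data on YouTube, the search algorithms are radically different from what most people think they are. It is not a simple database lookup of content in fields; an incredible number of signals are processed to make the results relevant, and after about 500 results the algorithms start to lose the ability to make the results worthwhile.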

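One thing that helped me wrap my mind around this was realizing that when YouTube talks about search, they are talking about probability rather than matches, so results ordered by relevance are ordered by their likelihood of being relevant to your query given your parameters. As you paginate, you eventually reach a point where, statistically speaking, the probability of relevance is low enough that it is not computationally worthwhile to return those results. So 500 is the limit that was decided on. (Also note that the number of "results" is not an approximation of matches; it is an approximation of potential matches. As you start to retrieve them, many of those possible matches are thrown away as not relevant at all, so that number does not really mean what people think it does. Google search works the same way.)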

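You might wonder why YouTube search works this way rather than doing more traditional string/data matching. With so much search volume, if they actually did a full search of all the data for every query, you would be waiting minutes at a time, if not longer. It really is a technical marvel, if you think about it, that the algorithms can return such relevant results for the top 500 cases when they are working from predictions, probabilities, and the like.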
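To make the practical consequence concrete, here is a minimal paging sketch. It runs without the YouTube client library: `Page`, `collectAll`, and `fakeSearch` are hypothetical names invented for this illustration, and `fakeSearch` only imitates the behavior described in the question (50 items per page, a reported total of 455,085, and no next token once 500 items have been served). The point it tries to show is that the loop should end on a null next-page token and should not use the reported total as a bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class PagingSketch {
    /** Hypothetical stand-in for one search response page. */
    static final class Page {
        final List<String> items;
        final String nextPageToken; // null on the last page
        final long totalResults;    // an estimate, not a promise

        Page(List<String> items, String nextPageToken, long totalResults) {
            this.items = items;
            this.nextPageToken = nextPageToken;
            this.totalResults = totalResults;
        }
    }

    /** Follows next-page tokens until the source stops handing them out. */
    static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String token = null; // null requests the first page
        do {
            Page page = fetchPage.apply(token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    /** Fake search: the token is simply the next offset, and paging ends at 500. */
    static Page fakeSearch(String token) {
        int offset = (token == null) ? 0 : Integer.parseInt(token);
        List<String> items = new ArrayList<>();
        for (int i = offset; i < offset + 50; i++) {
            items.add("video-" + i);
        }
        int next = offset + 50;
        String nextToken = (next < 500) ? String.valueOf(next) : null;
        return new Page(items, nextToken, 455_085L);
    }

    public static void main(String[] args) {
        List<String> results = collectAll(PagingSketch::fakeSearch);
        System.out.println("Collected " + results.size() + " results");
    }
}
```

With the real client, `fetchPage` would wrap `search.setPageToken(token)` followed by `search.execute()`, as in the question's code.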

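As for your second question, page tokens do not represent a unique set of results. They represent a kind of algorithmic state, so they point to your query, the progress of the query, and the direction of the query. Iteration 3, for example, is referenced both by the nextPageToken of iteration 2 and by the prevPageToken of iteration 4, but the two tokens are slightly different so that they can indicate the direction they came from.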
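The tokens printed in the question can be inspected to see this. The sketch below assumes they are URL-safe Base64 over a short sequence of protobuf-style varint fields. That is only an observation about these particular strings, not documented API behavior, so tokens should still be treated as opaque in real code. `PageTokenInspector` and `decodeVarintFields` are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

class PageTokenInspector {
    /**
     * Decodes a token as consecutive (tag byte, varint value) pairs and
     * returns the varint values in order. Assumes every field is a varint.
     */
    static List<Long> decodeVarintFields(String token) {
        byte[] bytes = Base64.getUrlDecoder().decode(token);
        List<Long> values = new ArrayList<>();
        int pos = 0;
        while (pos < bytes.length) {
            int tag = bytes[pos++] & 0xFF;
            if ((tag & 0x07) != 0) { // low three bits are the wire type; 0 means varint
                throw new IllegalArgumentException("Not a varint field in token: " + token);
            }
            long value = 0;
            int shift = 0;
            while (true) {
                if (pos >= bytes.length) {
                    throw new IllegalArgumentException("Truncated varint in token: " + token);
                }
                int b = bytes[pos++] & 0xFF;
                value |= (long) (b & 0x7F) << shift; // low seven bits carry data
                shift += 7;
                if ((b & 0x80) == 0) { // high bit clear marks the last byte
                    break;
                }
            }
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        String[] tokens = {"CDIQAA", "CDIQAQ", "CGQQAA", "CGQQAQ", "CMIDEAA"};
        for (String token : tokens) {
            System.out.println(token + " -> " + decodeVarintFields(token));
        }
    }
}
```

For the question's tokens this yields [50, 0] for CDIQAA, [50, 1] for CDIQAQ, [100, 0] for CGQQAA, [100, 1] for CGQQAQ, and [450, 0] for CMIDEAA. One plausible reading is that the first number is a result offset and the second a direction flag: the next and previous tokens of a page share the offset and differ only in the flag, which matches the "slightly different" tokens described above. The last token's offset of 450 plus a page of 50 also lines up with the 500-result cap.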

0

You can take the nextPageToken from a page and pass it as the pageToken parameter.

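This will show the next page. I added a var_dump to show you that the page tokens are not the same. Just copy this code and run it, and make sure the API client library folder is in the same folder as this script.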

<?php 
    function doit(){if (isset($_GET['q']) && isset($_GET['maxResults'])) { 
     // Call set_include_path() as needed to point to your client library. 
    // require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/Google_Client.php'); 
    // require_once ($_SERVER["DOCUMENT_ROOT"].'/API/youtube/google-api-php-client/src/contrib/Google_YouTubeService.php'); 
     set_include_path("./google-api-php-client/src"); 
     require_once 'Google_Client.php'; 
     require_once 'contrib/Google_YouTubeService.php'; 
     /* Set $DEVELOPER_KEY to the "API key" value from the "Access" tab of the 
     Google APIs Console <http://code.google.com/apis/console#access> 
     Please ensure that you have enabled the YouTube Data API for your project. */ 
     $DEVELOPER_KEY = 'YOUR_API_KEY'; // placeholder: never publish a real key



    $client = new Google_Client(); 
     $client->setDeveloperKey($DEVELOPER_KEY); 

     $youtube = new Google_YoutubeService($client); 

     try { 
     $searchResponse = $youtube->search->listSearch('id,snippet', array(
      'q' => $_GET['q'], 
      'maxResults' => $_GET['maxResults'], 

    )); 
    var_dump($searchResponse); 


    $searchResponse2 = $youtube->search->listSearch('id,snippet', array(
     'q' => $_GET['q'], 
     'maxResults' => $_GET['maxResults'], 
     'pageToken' => $searchResponse['nextPageToken'], 
    )); 
    var_dump($searchResponse2); 
    exit; 


    $videos = ''; 
    $channels = ''; 
     foreach ($searchResponse['items'] as $searchResult) { 
      switch ($searchResult['id']['kind']) { 
     case 'youtube#video': 

      $videoId =$searchResult['id']['videoId']; 
      $title = $searchResult['snippet']['title']; 
      $publishedAt= $searchResult['snippet']['publishedAt']; 
      $description = $searchResult['snippet']['description']; 
      $image_url = $searchResult['snippet']['thumbnails']['default']['url']; 
      $image_high = $searchResult['snippet']['thumbnails']['high']['url']; 

      echo '<div class="souligne" id="'.$videoId.'"> 

      <div > 
      <a href="http://www.youtube.com/watch?v='.$videoId.'" target="_blank"> 
      <img src="'.$image_url.'" width="150px" /> 
      </a> 
      </div> 
      <div class="title">'.$title.'</div> 
      <div class="des"> '.$description.' </div> 
      <a id="'.$videoId.'" onclick="supp(this)" class="linkeda"> 
       + ADD 
      </a>     
      </div>' 
      ; 
      break; 
     } 
    } 
    echo ' </ul></form>'; 

     } catch (Google_ServiceException $e) { 
     $htmlBody .= sprintf('<p>A service error occurred: <code>%s</code></p>', 
      htmlspecialchars($e->getMessage())); 
     } catch (Google_Exception $e) { 
     $htmlBody .= sprintf('<p>A client error occurred: <code>%s</code></p>', 
      htmlspecialchars($e->getMessage())); 
     } 
    }} 
     doit(); 
    ?> 
    <!doctype html> 
    <html> 
     <head> 
     <title>YouTube Search</title> 
    <link href="//www.w3resource.com/includes/bootstrap.css" rel="stylesheet"> 
    <style type="text/css"> 
    body{margin-top: 50px; margin-left: 50px} 
    </style> 
     </head> 
     <body> 
     <form method="GET"> 
     <div> 
     Search Term: <input type="search" id="q" name="q" placeholder="Enter Search Term"> 
     </div> 
     <div> 

     Max Results: <input type="number" id="maxResults" name="maxResults" min="1" max="1000000" step="1" value="25"> 
     </div> 
     <div> 
     page: <input type="number" id="startIndex" name="startIndex" min="1" max="50" step="1" value="2"> 
     </div> 
     <input type="submit" value="Search"> 
    </form> 

<h3>Videos</h3> 
    <ul><?php if(isset($videos))echo $videos; ?></ul> 
    <h3>Channels</h3> 
    <ul><?php if(isset($channels)) echo $channels; ?></ul> 
</body> 
</html> 
3

What I notice is that you have not included "nextPageToken" in setFields.

For example:

public class YoutubeConnector { // class name must match the constructor below
private YouTube youtube; 
private YouTube.Search.List query; 

public static final String KEY = "YOUR API KEY"; 

public YoutubeConnector(Context context) { 
    youtube = new YouTube.Builder(new NetHttpTransport(), new JacksonFactory(), new HttpRequestInitializer() { 
     @Override 
     public void initialize(HttpRequest httpRequest) throws IOException { 
     } 
    }).setApplicationName(context.getString(R.string.app_name)).build(); 

    try { 
     query = youtube.search().list("id,snippet"); 
     query.setMaxResults(10L); 
     query.setKey(KEY); 
     query.setType("video"); 
     query.setFields("items(id/videoId,snippet/title,snippet/description,snippet/thumbnails/default/url),nextPageToken"); 
    } catch (IOException e) { 
     Log.d("YC", "Could not initialize: " + e.getMessage()); 
    } 
} 

public List<VideoItem> search(String keywords) { 
    query.setQ(keywords); 
     try { 
      List<VideoItem> items = new ArrayList<VideoItem>(); 
      String nextToken = ""; 
      int i = 0; 
      do { 
       query.setPageToken(nextToken); 
       SearchListResponse response = query.execute(); 
       List<SearchResult> results = response.getItems(); 
       for (SearchResult result : results) { 
        VideoItem item = new VideoItem(); 
        item.setTitle(result.getSnippet().getTitle()); 
         item.setDescription(result.getSnippet().getDescription()); 
        item.setThumbnailURL(result.getSnippet().getThumbnails().getDefault().getUrl()); 
        item.setId(result.getId().getVideoId()); 
        items.add(item); 
       } 
       nextToken = response.getNextPageToken(); 
       i ++; 
       System.out.println("nextToken : "+ nextToken); 
      } while (nextToken != null && i < 20); 

      return items; 
     } catch (IOException e) { 
      Log.d("YC", "Could not search: " + e); 
      return null; 
     } 

} 
} 

I hope this helps.