2013-10-05 389 views
7

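How to download YouTube videos with Java

I am trying to figure out how to download a video from YouTube to the local file system. I have tried several packages, such as vGet, but cannot seem to get any of them to work. Any help is greatly appreciated.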

Answers

10

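I tried vget, https://github.com/axet/vget (since moved to https://gitlab.com/axet/vget), and it works fine. You can set it up with Maven, or download the dependencies listed in the pom file manually. The dependencies are: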

  • wget (https://github.com/axet/wget)

  • commons-io-2.4.jar

  • commons-lang3-3.1.jar

  • httpcore-4.3.jar

  • httpclient-4.3.jar

  • xstream-1.4.2.jar

Compiled with JDK 6.

I ran the direct download sample:

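import java.io.File;
import java.net.URL;

import com.github.axet.vget.VGet;

public class DirectDownload {

    public static void main(String[] args) {
        try {
            // Second argument is the target directory (the filesystem root here)
            VGet v = new VGet(new URL("http://www.youtube.com/watch?v=fNU4UNPNeWI"), new File("/"));
            v.download();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

}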
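The sample passes new File("/") as the target, which is the filesystem root. On many systems an ordinary user cannot write there, so it may be worth pointing the download at a directory you know is writable. A minimal sketch using only java.io (so it should also build on JDK 6); the class DownloadTarget, the method prepare and the directory name "vget-downloads" are made up for illustration and are not part of vget:

```java
import java.io.File;

class DownloadTarget {

    // Hypothetical helper: creates the directory if needed and checks that it
    // is writable, so it could be passed as the target argument in the sample.
    static File prepare(File base, String name) {
        File dir = new File(base, name);
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IllegalStateException("Cannot create directory: " + dir);
        }
        if (!dir.canWrite()) {
            throw new IllegalStateException("Directory is not writable: " + dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        // Assumption: the system temp directory is a reasonable place for a test run
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        File target = prepare(tmp, "vget-downloads");
        System.out.println("Download directory: " + target.getAbsolutePath());
    }
}
```

The returned File could then replace new File("/") in the call to the VGet constructor.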

Check out the ready-to-run example; the sources are included in the jar.

Zip file - 1,89KB https://www.wetransfer.com/downloads/465f7ef8c6a76f79e4cbd7c9f38a608c20131005141332/41c09a86ed8eaa6e61f59282eabda2a120131005141332/4ac689

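Update - the NPE problem when downloading, as mentioned in the comments

tl;dr: there seem to be a couple of issues with com.github.axet.vget.vhs.YouTubeParser, so I added non-intrusive code to patch it and make the example work as before. Simply replace the original YouTubeParser class with the one posted at the end.

Here is also another ready-to-run example, with a jar and all the required libraries, and the sources included in the jar. (It will be deleted automatically after a while, on 13 September 2014. The YouTube URL included in the code is random.)

Zip file - 1,69MB (wetransfer.com shows 1.7MB)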

https://www.wetransfer.com/downloads/7b2d9182c9d91577919df3907cfd025620140906080118/a9ded3ba71496d4df9b4d035ac5a1e3920140906080118/6acf46#

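A. Problems

  1. At com.github.axet.vget.vhs.YouTubeParser ln 229, the variable qs most of the time does not contain the query string that the HTTP GET executed with WGet should return. This causes an NPE later, when the query string is parsed.

  2. Once problem 1 is resolved, the sig variable is not found in the URLs returned by get_video_info, so parsing with Pattern.compile("sig=([^&,]*)") returns no value. This causes continuous retries without the video ever being downloaded.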
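To make problem 1 more concrete, here is a small sketch of how an empty response can end in an NPE. It does not use vget or httpclient: QueryStringCheck, parse and isFailure are made-up names, and parse is only a simplified stand-in for getQueryMap that returns an empty map for a blank input. With an empty map, map.get("status") is null, so calling equals on it throws; putting the constant first avoids that.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

class QueryStringCheck {

    // Simplified stand-in for a query string parser (standard library only).
    static Map<String, String> parse(String qs) {
        Map<String, String> map = new HashMap<String, String>();
        if (qs == null || qs.trim().length() == 0) {
            return map; // nothing to parse
        }
        for (String pair : qs.trim().split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            map.put(decode(name), decode(value));
        }
        return map;
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Null-safe form of the check map.get("status").equals("fail").
    static boolean isFailure(Map<String, String> map) {
        return "fail".equals(map.get("status"));
    }

    public static void main(String[] args) {
        Map<String, String> empty = parse("");
        System.out.println(empty.get("status")); // prints null
        System.out.println(isFailure(empty));    // prints false
        System.out.println(isFailure(parse("status=fail&errorcode=150"))); // prints true
    }
}
```

This only illustrates the failure mode; it does not explain why WGet returns an empty body in the first place.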

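B. Resolution (these are temporary patches, since the original format of the response and the reason WGet misbehaves are unknown)

  1. Call WGet again if the resulting query string is empty, this time without a WGet.HtmlLoader, which seems to do the job. A simple HTTP GET using Apache httpclient v4 is also provided as an alternative; in that case there is one more dependency, Apache commons-logging.jar.

Added at ln 248: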

 if (qs == null || qs.trim().length() == 0) { 
      qs = WGet.getHtml(url); 

////below is sample code for simple HTTP GET with httpclient v4 
////if used then apache commons-logging.jar is also required 
//   CloseableHttpClient httpclient = HttpClients.createDefault(); 
//   try { 
//    HttpGet httpget = new HttpGet(get); 
// 
//    System.out.println("Executing request " + httpget.getRequestLine()); 
// 
//    // Create a custom response handler 
//    ResponseHandler<String> responseHandler = new ResponseHandler<String>() { 
// 
//     public String handleResponse(
//       final HttpResponse response) throws ClientProtocolException, IOException { 
//      int status = response.getStatusLine().getStatusCode(); 
//      if (status >= 200 && status < 300) { 
//       HttpEntity entity = response.getEntity(); 
//       return entity != null ? EntityUtils.toString(entity) : null; 
//      } else { 
//       throw new ClientProtocolException("Unexpected response status: " + status); 
//      } 
//     } 
// 
//    }; 
//    String responseBody = httpclient.execute(httpget, responseHandler); 
//    qs = responseBody; 
//   } finally { 
//    httpclient.close(); 
//   } 
     } 
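The same "try again if the body is empty" idea can be written as a small standalone helper. This is only a sketch of the pattern, not vget code: FallbackFetch and firstNonBlank are hypothetical names, and the two fetchers in main are dummies standing in for the WGet call with and without a WGet.HtmlLoader.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;

class FallbackFetch {

    // Tries each fetcher in order and returns the first non-blank result,
    // or null if every attempt comes back empty or fails.
    static String firstNonBlank(List<Callable<String>> fetchers) {
        for (Callable<String> fetcher : fetchers) {
            try {
                String body = fetcher.call();
                if (body != null && body.trim().length() > 0) {
                    return body;
                }
            } catch (Exception e) {
                // treat a failed attempt like an empty one and try the next fetcher
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Dummy fetchers: the first simulates the empty response, the second a good one
        Callable<String> emptyFetch = new Callable<String>() {
            public String call() {
                return "";
            }
        };
        Callable<String> workingFetch = new Callable<String>() {
            public String call() {
                return "status=ok&title=demo";
            }
        };
        System.out.println(firstNonBlank(Arrays.asList(emptyFetch, workingFetch)));
        // prints status=ok&title=demo
    }
}
```

A null result means every attempt failed, which the caller should handle before parsing.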

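  2. After looking at the response, the signature does seem to be in there somewhere, so I improved the parsing a bit. If the pattern Pattern.compile("sig=([^&,]*)") returns nothing, Pattern.compile("signature%3D([^&,%]*)") is tried as well. Note that the posted code uses signature= rather than sig= in the first pattern. This change is in the method extractUrlEncodedVideos.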

String sig = null; 
       { 
        Pattern link = Pattern.compile("signature=([^&,]*)"); 
        Matcher linkMatch = link.matcher(urlString); 
        if (linkMatch.find()) { 
         sig = linkMatch.group(1); 
        } else { 
         link = Pattern.compile("signature%3D([^&,%]*)"); 
         linkMatch = link.matcher(urlString); 
         if (linkMatch.find()) { 
          sig = linkMatch.group(1); 
         } 
        } 
       } 
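Pulled out of the parser, the fallback can be tried on its own. A minimal sketch with the two patterns from the snippet above; the class and method names (SignatureExtractor, extractSignature) and the sample inputs are made up for illustration, and real get_video_info responses may look different.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SignatureExtractor {

    // Plain form first, then the URL-encoded form ("%3D" is an encoded "=").
    private static final Pattern PLAIN = Pattern.compile("signature=([^&,]*)");
    private static final Pattern ENCODED = Pattern.compile("signature%3D([^&,%]*)");

    // Returns the signature value, or null if neither pattern matches.
    static String extractSignature(String urlString) {
        Matcher m = PLAIN.matcher(urlString);
        if (m.find()) {
            return m.group(1);
        }
        m = ENCODED.matcher(urlString);
        if (m.find()) {
            return m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up inputs, only to exercise both branches
        System.out.println(extractSignature("itag=22&signature=ABC.123&quality=hd720"));
        // prints ABC.123
        System.out.println(extractSignature("http%3A%2F%2Fexample.invalid%2Fv%3Fsignature%3DDEF.456%26itag%3D22"));
        // prints DEF.456
        System.out.println(extractSignature("itag=22&quality=hd720"));
        // prints null
    }
}
```

The encoded pattern stops at the next % so that a following encoded separator such as %26 does not end up in the signature.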

The modified com.github.axet.vget.vhs.YouTubeParser file is as follows:

package com.github.axet.vget.vhs; 

import java.net.MalformedURLException; 
import java.net.URI; 
import java.net.URISyntaxException; 
import java.net.URL; 
import java.net.URLDecoder; 
import java.util.ArrayList; 
import java.util.HashMap; 
import java.util.List; 
import java.util.Map; 
import java.util.concurrent.atomic.AtomicBoolean; 
import java.util.regex.Matcher; 
import java.util.regex.Pattern; 

import org.apache.commons.lang3.StringEscapeUtils; 
import org.apache.commons.lang3.StringUtils; 
import org.apache.http.NameValuePair; 
import org.apache.http.client.utils.URLEncodedUtils; 

import com.github.axet.vget.info.VGetParser; 
import com.github.axet.vget.info.VideoInfo; 
import com.github.axet.vget.info.VideoInfo.States; 
import com.github.axet.vget.info.VideoInfo.VideoQuality; 
import com.github.axet.wget.WGet; 
import com.github.axet.wget.info.ex.DownloadError; 
import java.io.IOException; 
import org.apache.http.HttpEntity; 
import org.apache.http.HttpResponse; 
import org.apache.http.client.ClientProtocolException; 
import org.apache.http.client.ResponseHandler; 
import org.apache.http.client.methods.HttpGet; 
import org.apache.http.impl.client.CloseableHttpClient; 
import org.apache.http.impl.client.HttpClients; 
import org.apache.http.util.EntityUtils; 

public class YouTubeParser extends VGetParser { 

    public static class VideoUnavailablePlayer extends DownloadError { 

     private static final long serialVersionUID = 10905065542230199L; 

     public VideoUnavailablePlayer() { 
      super("unavailable-player"); 
     } 
    } 

    public static class AgeException extends DownloadError { 

     private static final long serialVersionUID = 1L; 

     public AgeException() { 
      super("Age restriction, account required"); 
     } 
    } 

    public static class PrivateVideoException extends DownloadError { 

     private static final long serialVersionUID = 1L; 

     public PrivateVideoException() { 
      super("Private video"); 
     } 

     public PrivateVideoException(String s) { 
      super(s); 
     } 
    } 

    public static class EmbeddingDisabled extends DownloadError { 

     private static final long serialVersionUID = 1L; 

     public EmbeddingDisabled(String msg) { 
      super(msg); 
     } 
    } 

    public static class VideoDeleted extends DownloadError { 

     private static final long serialVersionUID = 1L; 

     public VideoDeleted(String msg) { 
      super(msg); 
     } 
    } 

    List<VideoDownload> sNextVideoURL = new ArrayList<VideoDownload>(); 

    URL source; 

    public YouTubeParser(URL input) { 
     this.source = input; 
    } 

    public static boolean probe(URL url) { 
     return url.toString().contains("youtube.com"); 
    } 

    void downloadone(VideoInfo info, AtomicBoolean stop, Runnable notify) throws Exception { 
     try { 
      extractEmbedded(info, stop, notify); 
     } catch (EmbeddingDisabled e) { 
      streamCpature(info, stop, notify); 
     } 
    } 

    /** 
    * do not allow to download age restricted videos 
    * 
    * @param info 
    * @param stop 
    * @param notify 
    * @throws Exception 
    */ 
    void streamCpature(final VideoInfo info, final AtomicBoolean stop, final Runnable notify) throws Exception { 
     String html; 
     html = WGet.getHtml(info.getWeb(), new WGet.HtmlLoader() { 
      @Override 
      public void notifyRetry(int delay, Throwable e) { 
       info.setDelay(delay, e); 
       notify.run(); 
      } 

      @Override 
      public void notifyDownloading() { 
       info.setState(States.DOWNLOADING); 
       notify.run(); 
      } 

      @Override 
      public void notifyMoved() { 
       info.setState(States.RETRYING); 
       notify.run(); 
      } 
     }, stop); 
     extractHtmlInfo(info, html, stop, notify); 
     extractIcon(info, html); 
    } 

    /** 
    * Add resolution video for specific youtube link. 
    * 
    * @param url download source url 
    * @throws MalformedURLException 
    */ 
    void addVideo(String itag, String url) throws MalformedURLException { 
     Integer i = Integer.decode(itag); 
     VideoQuality vd = itagMap.get(i); 

     URL u = new URL(url); 

     if (u != null) { 
      sNextVideoURL.add(new VideoDownload(vd, u)); 
     } 
    } 

    // http://en.wikipedia.org/wiki/YouTube#Quality_and_codecs 
    static final Map<Integer, VideoQuality> itagMap = new HashMap<Integer, VideoInfo.VideoQuality>() { 
     private static final long serialVersionUID = -6925194111122038477L; 

     { 
      put(120, VideoQuality.p720); 
      put(102, VideoQuality.p720); 
      put(101, VideoQuality.p360); 
      put(100, VideoQuality.p360); 
      put(85, VideoQuality.p520); 
      put(84, VideoQuality.p720); 
      put(83, VideoQuality.p240); 
      put(82, VideoQuality.p360); 
      put(46, VideoQuality.p1080); 
      put(45, VideoQuality.p720); 
      put(44, VideoQuality.p480); 
      put(43, VideoQuality.p360); 
      put(38, VideoQuality.p3072); 
      put(37, VideoQuality.p1080); 
      put(36, VideoQuality.p240); 
      put(35, VideoQuality.p480); 
      put(34, VideoQuality.p360); 
      put(22, VideoQuality.p720); 
      put(18, VideoQuality.p360); 
      put(17, VideoQuality.p144); 
      put(6, VideoQuality.p270); 
      put(5, VideoQuality.p240); 
     } 
    }; 

    public static String extractId(URL url) { 
     { 
      Pattern u = Pattern.compile("youtube.com/watch?.*v=([^&]*)"); 
      Matcher um = u.matcher(url.toString()); 
      if (um.find()) { 
       return um.group(1); 
      } 
     } 

     { 
      Pattern u = Pattern.compile("youtube.com/v/([^&]*)"); 
      Matcher um = u.matcher(url.toString()); 
      if (um.find()) { 
       return um.group(1); 
      } 
     } 

     return null; 
    } 

    /** 
    * allows to download age restricted videos 
    * 
    * @param info 
    * @param stop 
    * @param notify 
    * @throws Exception 
    */ 
    void extractEmbedded(final VideoInfo info, final AtomicBoolean stop, final Runnable notify) throws Exception { 
     String id = extractId(source); 
     if (id == null) { 
      throw new RuntimeException("unknown url"); 
     } 

     info.setTitle(String.format("http://www.youtube.com/watch?v=%s", id)); 

     String get = String 
       .format("http://www.youtube.com/get_video_info?video_id=%s&el=embedded&ps=default&eurl=", id); 

     URL url = new URL(get); 

     String qs = WGet.getHtml(url, new WGet.HtmlLoader() { 
      @Override 
      public void notifyRetry(int delay, Throwable e) { 
       info.setDelay(delay, e); 
       notify.run(); 
      } 

      @Override 
      public void notifyDownloading() { 
       info.setState(States.DOWNLOADING); 
       notify.run(); 
      } 

      @Override 
      public void notifyMoved() { 
       info.setState(States.RETRYING); 
       notify.run(); 
      } 
     }, stop); 

     if (qs == null || qs.trim().length() == 0) { 
      qs = WGet.getHtml(url); 

////below is sample code for simple HTTP GET with httpclient v4 
////if used then apache commons-logging.jar is also required 
//   CloseableHttpClient httpclient = HttpClients.createDefault(); 
//   try { 
//    HttpGet httpget = new HttpGet(get); 
// 
//    System.out.println("Executing request " + httpget.getRequestLine()); 
// 
//    // Create a custom response handler 
//    ResponseHandler<String> responseHandler = new ResponseHandler<String>() { 
// 
//     public String handleResponse(
//       final HttpResponse response) throws ClientProtocolException, IOException { 
//      int status = response.getStatusLine().getStatusCode(); 
//      if (status >= 200 && status < 300) { 
//       HttpEntity entity = response.getEntity(); 
//       return entity != null ? EntityUtils.toString(entity) : null; 
//      } else { 
//       throw new ClientProtocolException("Unexpected response status: " + status); 
//      } 
//     } 
// 
//    }; 
//    String responseBody = httpclient.execute(httpget, responseHandler); 
//    qs = responseBody; 
//   } finally { 
//    httpclient.close(); 
//   } 
     } 

     Map<String, String> map = getQueryMap(qs); 

     if (map.get("status").equals("fail")) { 
      String r = URLDecoder.decode(map.get("reason"), "UTF-8"); 
      if (map.get("errorcode").equals("150")) { 
       throw new EmbeddingDisabled("error code 150"); 
      } 
      if (map.get("errorcode").equals("100")) { 
       throw new VideoDeleted("error code 100"); 
      } 

      throw new DownloadError(r); 
      // throw new PrivateVideoException(r); 
     } 

     info.setTitle(URLDecoder.decode(map.get("title"), "UTF-8")); 

     // String fmt_list = URLDecoder.decode(map.get("fmt_list"), "UTF-8"); 
     // String[] fmts = fmt_list.split(","); 
     String url_encoded_fmt_stream_map = URLDecoder.decode(map.get("url_encoded_fmt_stream_map"), "UTF-8"); 

     extractUrlEncodedVideos(url_encoded_fmt_stream_map); 

     // 'iurlmaxres' or 'iurlsd' or 'thumbnail_url'
     String icon = map.get("thumbnail_url"); 
     icon = URLDecoder.decode(icon, "UTF-8"); 
     info.setIcon(new URL(icon)); 
    } 

    void extractIcon(VideoInfo info, String html) { 
     try { 
      Pattern title = Pattern.compile("itemprop=\"thumbnailUrl\" href=\"(.*)\""); 
      Matcher titleMatch = title.matcher(html); 
      if (titleMatch.find()) { 
       String sline = titleMatch.group(1); 
       sline = StringEscapeUtils.unescapeHtml4(sline); 
       info.setIcon(new URL(sline)); 
      } 
     } catch (RuntimeException e) { 
      throw e; 
     } catch (Exception e) { 
      throw new RuntimeException(e); 
     } 
    } 

    public static Map<String, String> getQueryMap(String qs) { 
     try { 
      qs = qs.trim(); 
      List<NameValuePair> list; 
      list = URLEncodedUtils.parse(new URI(null, null, null, -1, null, qs, null), "UTF-8"); 
      HashMap<String, String> map = new HashMap<String, String>(); 
      for (NameValuePair p : list) { 
       map.put(p.getName(), p.getValue()); 
      } 
      return map; 
     } catch (URISyntaxException e) { 
      throw new RuntimeException(qs, e); 
     } 
    } 

    void extractHtmlInfo(VideoInfo info, String html, AtomicBoolean stop, Runnable notify) throws Exception { 
     { 
      Pattern age = Pattern.compile("(verify_age)"); 
      Matcher ageMatch = age.matcher(html); 
      if (ageMatch.find()) { 
       throw new AgeException(); 
      } 
     } 

     { 
      Pattern age = Pattern.compile("(unavailable-player)"); 
      Matcher ageMatch = age.matcher(html); 
      if (ageMatch.find()) { 
       throw new VideoUnavailablePlayer(); 
      } 
     } 

     { 
      Pattern urlencod = Pattern.compile("\"url_encoded_fmt_stream_map\": \"([^\"]*)\""); 
      Matcher urlencodMatch = urlencod.matcher(html); 
      if (urlencodMatch.find()) { 
       String url_encoded_fmt_stream_map; 
       url_encoded_fmt_stream_map = urlencodMatch.group(1); 

       // normal embedded video, unable to grab age restricted videos 
       Pattern encod = Pattern.compile("url=(.*)"); 
       Matcher encodMatch = encod.matcher(url_encoded_fmt_stream_map); 
       if (encodMatch.find()) { 
        String sline = encodMatch.group(1); 

        extractUrlEncodedVideos(sline); 
       } 

       // stream video 
       Pattern encodStream = Pattern.compile("stream=(.*)"); 
       Matcher encodStreamMatch = encodStream.matcher(url_encoded_fmt_stream_map); 
       if (encodStreamMatch.find()) { 
        String sline = encodStreamMatch.group(1); 

        String[] urlStrings = sline.split("stream="); 

        for (String urlString : urlStrings) { 
         urlString = StringEscapeUtils.unescapeJava(urlString); 

         Pattern link = Pattern.compile("(sparams.*)&itag=(\\d+)&.*&conn=rtmpe(.*),"); 
         Matcher linkMatch = link.matcher(urlString); 
         if (linkMatch.find()) { 

          String sparams = linkMatch.group(1); 
          String itag = linkMatch.group(2); 
          String url = linkMatch.group(3); 

          url = "http" + url + "?" + sparams; 

          url = URLDecoder.decode(url, "UTF-8"); 

          addVideo(itag, url); 
         } 
        } 
       } 
      } 
     } 

     { 
      Pattern title = Pattern.compile("<meta name=\"title\" content=(.*)"); 
      Matcher titleMatch = title.matcher(html); 
      if (titleMatch.find()) { 
       String sline = titleMatch.group(1); 
       String name = sline.replaceFirst("<meta name=\"title\" content=", "").trim(); 
       name = StringUtils.strip(name, "\">"); 
       name = StringEscapeUtils.unescapeHtml4(name); 
       info.setTitle(name); 
      } 
     } 
    } 

    void extractUrlEncodedVideos(String sline) throws Exception { 
     String[] urlStrings = sline.split("url="); 

     for (String urlString : urlStrings) { 
      urlString = StringEscapeUtils.unescapeJava(urlString); 

      // universal request 
      { 
       String url = null; 
       { 
        Pattern link = Pattern.compile("([^&]*)&"); 
        Matcher linkMatch = link.matcher(urlString); 
        if (linkMatch.find()) { 
         url = linkMatch.group(1); 
         url = URLDecoder.decode(url, "UTF-8"); 
        } 
       } 
       String itag = null; 
       { 
        Pattern link = Pattern.compile("itag=(\\d+)"); 
        Matcher linkMatch = link.matcher(urlString); 
        if (linkMatch.find()) { 
         itag = linkMatch.group(1); 
        } 
       } 
       String sig = null; 
       { 
        Pattern link = Pattern.compile("signature=([^&,]*)"); 
        Matcher linkMatch = link.matcher(urlString); 
        if (linkMatch.find()) { 
         sig = linkMatch.group(1); 
        } else { 
         link = Pattern.compile("signature%3D([^&,%]*)"); 
         linkMatch = link.matcher(urlString); 
         if (linkMatch.find()) { 
          sig = linkMatch.group(1); 
         } 
        } 
       } 

       if (url != null && itag != null && sig != null) { 
        try { 
         new URL(url); 

         if (sig != null) { 
          url += "&signature=" + sig; 
         } 

         if (itag != null) { 
          addVideo(itag, url); 
          continue; 
         } 
        } catch (MalformedURLException e) { 
         // ignore bad urls 
        } 
       } 
      } 
     } 
    } 

    @Override 
    public void extract(VideoInfo info, AtomicBoolean stop, Runnable notify) { 
     try { 
      downloadone(info, stop, notify); 

      getVideo(info, sNextVideoURL); 
     } catch (RuntimeException e) { 
      throw e; 
     } catch (Exception e) { 
      throw new RuntimeException(e); 
     } 
    } 

} 
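A side note on extractId in the listing above: its patterns leave the dots and the question mark unescaped, so they match more loosely than they read. The following is a sketch of a stricter variant, not the library's code; VideoIdExtractor is a made-up name, and the decision to stop the ID at &, # and similar characters is my own assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VideoIdExtractor {

    // watch?v=ID, where v may also come after other query parameters
    private static final Pattern WATCH =
            Pattern.compile("youtube\\.com/watch\\?(?:.*&)?v=([^&#]+)");
    // /v/ID form
    private static final Pattern EMBED =
            Pattern.compile("youtube\\.com/v/([^?&#/]+)");

    // Returns the video ID, or null if the URL matches neither form.
    static String extractId(String url) {
        Matcher m = WATCH.matcher(url);
        if (m.find()) {
            return m.group(1);
        }
        m = EMBED.matcher(url);
        if (m.find()) {
            return m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractId("http://www.youtube.com/watch?v=fNU4UNPNeWI"));
        // prints fNU4UNPNeWI
        System.out.println(extractId("http://www.youtube.com/watch?feature=share&v=abc123&t=10"));
        // prints abc123
        System.out.println(extractId("http://www.youtube.com/v/abc123?version=3"));
        // prints abc123
        System.out.println(extractId("http://example.com/watch?v=abc123"));
        // prints null
    }
}
```

Whether the stricter matching is needed depends on what URLs you feed in; for ordinary watch URLs the original patterns return the same ID.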
+0

Can we use this for commercial purposes? I mean, is there any license here? –

+0

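@Shabarinath I can only say for certain that the code I posted is free to use. As for vget and the libraries it depends on, I am not sure. My guess is that they can be used freely in commercial projects but cannot be sold as a commercial product in themselves, i.e. "I sell vget". All of this is just speculation, though; I suggest you check their licenses, at least those of vget and wget. – melc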

+0

The link to the ready-to-run example is not working – Confuse