Given a URL string, how do I read all bytes into memory as fast as possible?

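Given a URL string, I want to read all of its bytes (up to a specified number n) into memory as fast as possible.

I would like to know what the best solution to this problem is.

I have come up with two solutions. However, because an internet connection is never constant, it is not possible to time the methods to see which one is more time-efficient. So, in principle, which of these two functions should be more time-efficient?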

    public static int GetBytes(String url, byte[] destination) throws Exception {
        // read all bytes (up to destination.length) into destination starting from offset 0
        java.io.InputStream input_stream = new java.net.URL(url).openStream();
        int total_bytes_read = 0;
        int ubound = destination.length - 1;
        while (true) {
            int data = input_stream.read();
            if (data == -1) {
                break;
            }
            destination[total_bytes_read] = (byte) data;
            if (total_bytes_read == ubound) {
                break;
            }
            ++total_bytes_read;
        }
        input_stream.close();
        return total_bytes_read;
    }

    public static int GetBytes2(String url, byte[] destination) throws Exception {
        // read all bytes (up to destination.length) into destination starting from offset 0
        java.io.InputStream input_stream = new java.net.URL(url).openStream();
        int total_bytes_read = 0;
        while (true) {
            int bytes_to_read = destination.length - total_bytes_read;
            if (bytes_to_read == 0) {
                break;
            }
            int bytes_read = input_stream.read(destination, total_bytes_read, bytes_to_read);
            if (bytes_read == -1) {
                break;
            }
            total_bytes_read += bytes_read;
        }
        input_stream.close();
        return total_bytes_read;
    }

Test code:

    public final class Test {
        public static void main(String args[]) throws Exception {
            String url = "http://en.wikipedia.org/wiki/August_2010_in_sports"; // a really huuge page
            byte[] destination = new byte[3000000];
            long a = System.nanoTime();
            int bytes_read = GetBytes(url, destination);
            long b = System.nanoTime();
            System.out.println((b - a)/1000000d);
        }
    }

These are the results I got from my test code (in milliseconds):

GetBytes:
    12550.803514
    12579.65927
    12630.308032
    12376.435205
    12903.350407
    12637.59136
    12671.536975
    12503.170865

GetBytes2:
    12866.636589
    12372.011314
    12505.079466
    12514.486199
    12380.704728
    19126.36572
    12294.946634
    12613.454368

Basically, I want to know whether anyone knows a better way to read all bytes from a URL into memory in as little time as possible.

2012-01-17 Pacerier

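I would suggest using jsoup, the Java HTML parser. I tried the URL you gave with jsoup instead of your code, and the time taken was about a quarter of the time your code took.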

    // requires: import org.jsoup.Jsoup; import org.jsoup.nodes.Document;
    long a = System.nanoTime();
    Document doc = Jsoup.connect("http://en.wikipedia.org/wiki/August_2010_in_sports").get();
    String title = doc.title();
    // System.out.println(doc.html()); // will print whole html code
    System.out.println(title);
    long b = System.nanoTime();
    System.out.println("Time Taken " + (b - a)/1000000d);

Output:

August 2010 in sports - Wikipedia, the free encyclopedia 
Time Taken 3842.634244

Try this. You need to download the JAR files to use jsoup.

2012-01-17 05:27:55 vikiiii

Btw, with your internet connection, how much time did it take using the test code above? – Pacerier 2012-01-17 06:33:32

Time taken 3842.634244 – vikiiii 2012-01-17 07:30:53

No, I meant using the test code in my post.. – Pacerier 2012-01-17 09:22:08

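The more bytes you read at a time, the faster the read. Each read() call polls your input device and incurs significant overhead when repeated many times. GetBytes2() is faster than GetBytes(). Threading might also increase your read speed, but the best solution is to optimize your algorithm.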
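The per-call overhead described above can be made visible by counting how many read calls each strategy issues. The following is only a minimal sketch under stated assumptions: the class `ReadCallCount`, the wrapper `CountingStream`, and the helpers `readSingleBytes` and `readBulk` are hypothetical names, and an in-memory stream stands in for the network, so it shows call counts rather than real timings.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadCallCount {
    /** Hypothetical wrapper that counts the read calls forwarded to the wrapped stream. */
    static final class CountingStream extends FilterInputStream {
        int calls = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** One read() call per byte, in the style of GetBytes. */
    static int readSingleBytes(InputStream in, byte[] dest) throws IOException {
        int total = 0;
        while (total < dest.length) {
            int data = in.read();
            if (data == -1) {
                break;
            }
            dest[total++] = (byte) data;
        }
        return total;
    }

    /** Asks for as many bytes as still fit on each call, in the style of GetBytes2. */
    static int readBulk(InputStream in, byte[] dest) throws IOException {
        int total = 0;
        while (total < dest.length) {
            int n = in.read(dest, total, dest.length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[100_000]; // stand-in for the page contents

        CountingStream single = new CountingStream(new ByteArrayInputStream(source));
        int readA = readSingleBytes(single, new byte[source.length]);

        CountingStream bulk = new CountingStream(new ByteArrayInputStream(source));
        int readB = readBulk(bulk, new byte[source.length]);

        System.out.println("single-byte: " + readA + " bytes in " + single.calls + " calls");
        System.out.println("bulk:        " + readB + " bytes in " + bulk.calls + " calls");
    }
}
```

On a real network stream each bulk call typically returns only what has already arrived, so the call count falls somewhere between the two extremes; whether the difference is measurable next to network latency is a separate question.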

2012-01-17 03:26:58 collinjsimpson

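Yes, there is a lot of overhead if we are reading from an input device such as a hard disk, but over a TCP connection there doesn't seem to be much of a difference. (See the updated test code in my post.) – Pacerier 2012-01-17 03:41:02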

The second method is faster, but in your case the limiting factor is definitely the speed of the network connection. – MRAB 2012-01-17 03:54:33

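Would it read 2147483647 bytes (max int) at a time? Also, if there is no delay when reading from the URL InputStream, could that mean the URL's contents are already being read into memory when the InputStream is created? That seems plausible if successive read() calls cause no delay. Try measuring the time taken to open the URL and the time taken to read the bytes separately to learn more. – collinjsimpson 2012-01-17 03:57:59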
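The measurement suggested in the last comment could be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the class `PhaseTimer` and the method `timePhases` are hypothetical names, and when no URL is passed on the command line the example falls back to a temporary local file so that it runs without a network connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;

public class PhaseTimer {
    /** Returns {nanoseconds until the stream is open, nanoseconds spent reading, bytes read}. */
    static long[] timePhases(URL url, byte[] destination) throws IOException {
        long start = System.nanoTime();
        URLConnection connection = url.openConnection();
        // For HTTP URLs, getInputStream() sends the request and waits for the response headers.
        try (InputStream in = connection.getInputStream()) {
            long opened = System.nanoTime();
            int total = 0;
            while (total < destination.length) {
                int n = in.read(destination, total, destination.length - total);
                if (n == -1) {
                    break;
                }
                total += n;
            }
            long done = System.nanoTime();
            return new long[] { opened - start, done - opened, total };
        }
    }

    public static void main(String[] args) throws IOException {
        URL url;
        if (args.length > 0) {
            url = URI.create(args[0]).toURL();
        } else {
            // Assumption for the offline demo: a 1 MB temporary file stands in for the page.
            Path tmp = Files.createTempFile("phase-timer", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[1_000_000]);
            url = tmp.toUri().toURL();
        }
        long[] r = timePhases(url, new byte[3_000_000]);
        System.out.printf("open: %.3f ms, read: %.3f ms, bytes: %d%n",
                r[0] / 1e6, r[1] / 1e6, r[2]);
    }
}
```

If the "open" figure dominates for an HTTP URL, most of the time is going into connection setup and waiting for the server's first response, which neither read strategy can change.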
