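How do I find the best URL format to use when opening a connection?
What is the best way to handle URL variations (such as "www" and "https") when using URL.openConnection()?
Many websites return different results depending on whether the URL uses "www" and/or "https".
For example, here is a test I wrote to show some of the different results: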
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    public class Test {
        private static final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36";

        public static void main(String[] args) {
            String baseURL = "google.com";
            // The four scheme/host variants to compare.
            String[] prefixes = {"http://", "http://www.", "https://", "https://www."};
            for (String prefix : prefixes) {
                String address = prefix + baseURL;
                try {
                    URLConnection connection = new URL(address).openConnection();
                    connection.setRequestProperty("User-Agent", USER_AGENT);
                    int lineCount = 0;
                    // try-with-resources closes the reader even if reading fails.
                    try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
                        while (in.readLine() != null) {
                            lineCount++;
                        }
                    }
                    System.out.println(address + " = " + lineCount + " lines");
                } catch (Exception ex) {
                    System.out.println(address + " throws an error");
                }
            }
        }
    }
Here are the results of running it against 4 different websites:
http://stackoverflow.com = 4205 lines
http://www.stackoverflow.com = 4205 lines
https://stackoverflow.com = 4205 lines
https://www.stackoverflow.com = 2 lines
http://qvc.com = 2438 lines
http://www.qvc.com = 2438 lines
https://qvc.com throws an error
https://www.qvc.com = 0 lines
http://facebook.com = 0 lines
http://www.facebook.com = 0 lines
https://facebook.com = 25 lines
https://www.facebook.com = 25 lines
http://google.com = 6 lines
http://www.google.com = 6 lines
https://google.com = 343 lines
https://www.google.com = 343 lines
Given a base URL such as "google.com", what is the correct way to check which format I should use for that site?
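Presumably, the http responses are redirects to the secure HTTPS protocol. –
Check the response code. If you get a redirect, then you are probably using the wrong format. For example, 'www.stackoverflow.com' will issue a 301 redirect to 'stackoverflow.com'. –
@MarcB - Yes, I figured it would be something like that. Could you post it as an answer? – Pikamander2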
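The suggestion in the comments (check the response code and treat a redirect as a sign of the wrong format) could be sketched roughly as follows. This is only an illustration, not a definitive answer: the class name `RedirectProbe`, the helper names, the five-hop limit and the timeouts are all assumptions made for the example. It turns off automatic redirect handling and follows the `Location` header by hand, partly because, as far as I know, `HttpURLConnection` does not automatically follow a redirect that changes protocol (http to https), which may explain the very short or empty responses in the results above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

class RedirectProbe {

    /** The four scheme/host variants tried in the question. */
    static List<String> candidates(String baseURL) {
        return Arrays.asList(
                "http://" + baseURL,
                "http://www." + baseURL,
                "https://" + baseURL,
                "https://www." + baseURL);
    }

    /** True for the 3xx status codes that normally carry a Location header. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    /**
     * Follows redirects by hand and returns the first URL that does not redirect.
     * Assumes every hop is http or https; maxHops guards against redirect loops.
     */
    static URL resolveFinalUrl(URL start, int maxHops) throws IOException {
        URL current = start;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection conn = (HttpURLConnection) current.openConnection();
            conn.setInstanceFollowRedirects(false); // inspect each redirect ourselves
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            try {
                int status = conn.getResponseCode();
                String location = conn.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return current; // no further redirect: treat this as the preferred form
                }
                // Location may be relative, so resolve it against the current URL.
                current = new URL(current, location);
            } finally {
                conn.disconnect();
            }
        }
        throw new IOException("Too many redirects starting from " + start);
    }

    public static void main(String[] args) {
        String baseURL = args.length > 0 ? args[0] : "google.com";
        for (String candidate : candidates(baseURL)) {
            try {
                System.out.println(candidate + " -> " + resolveFinalUrl(new URL(candidate), 5));
            } catch (IOException ex) {
                System.out.println(candidate + " failed: " + ex);
            }
        }
    }
}
```

If all four variants end up at the same final URL, that URL is probably the format the site prefers. Some sites answer differently depending on request headers such as `User-Agent`, so the header from the original test may need to be set here as well.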