Unable to read HTML from a web page

I want to get some data (HTML tags) from a web page, but I can't. For some reason, I mostly get empty tags.

Here is the URL: http://www.miamidade.gov/transit/mobile/routes.asp

Here is my Java code:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class xyz {
    public static void main(String[] args) throws IOException {
        // Request the page and print whatever HTML the server returns.
        Document doc = Jsoup.connect("http://www.miamidade.gov/transit/mobile/routes.asp")
                .userAgent("Mozilla/5.0")
                .timeout(3000)
                .post();
        String html = doc.html();
        System.out.print(html);
    }
}


2011-07-11 skinnycat

Try it like this:

// Jsoup.parse(URL, int) fetches and parses the page; needs import java.net.URL
Document doc = Jsoup.parse(new URL("http://www.miamidade.gov/transit/mobile/routes.asp"), 10000);
System.out.print(doc.toString());

Maybe the timeout is not long enough for your page.


2011-07-11 09:49:43 Rasel

It didn't work for me... Did you try the source code, and did it work for you? Please let me know... – skinnycat

Yes, it works for me, although I didn't test it with your web page. – Rasel

It works for me too... thanks. – skinnycat

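The page at http://www.miamidade.gov/transit/mobile/routes.asp first does a JavaScript redirect to "scriptCheck.asp?script=yes&CurrentPage=/transit/mobile/routes.asp?". It then eventually reloads http://www.miamidade.gov/transit/mobile/routes.asp, this time with the information you see on the page. Jsoup doesn't seem to handle that redirect, so your code fetches the first page and returns that HTML, which is different from what you see in a browser. Maybe that's why you aren't finding the information you expect. Here is the first page: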

<html> 
<head> 
    <title></title> 
    <script language="JavaScript"> 
<!-- 
window.location="scriptCheck.asp?script=yes&CurrentPage=/transit/mobile/routes.asp?"; 
//--> 

    </script> 
</head> 
<body> 
    <noscript> 
    <meta http-equiv="Refresh" content="0;URL=scriptCheck.asp?script=no&amp;CurrentPage=/transit/mobile/routes.asp?" /> 
    </noscript> 
    <noscript> 
    <br /> 
    <br /> 
    <a href="scriptCheck.asp?script=no&amp;CurrentPage=/transit/mobile/routes.asp?">Enter MDT Mobile Services Site</a> 
    <br /> 
    <br /> 
    </noscript> 
</body> 
</html>
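Since Jsoup does not run JavaScript, one possible workaround is to pull the redirect target out of that first page yourself and request it in a second step. The following is only a minimal sketch of the extraction part, using nothing but the Java standard library. The class name `RedirectTarget`, the method `findRedirect`, and the regular expression are hypothetical names introduced for illustration, and the sketch assumes the page keeps assigning a quoted string to `window.location` as shown above.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RedirectTarget {
    // Matches window.location = "..." or window.location.href = '...'
    // (assumption: the target is a plain quoted string literal).
    private static final Pattern JS_REDIRECT = Pattern.compile(
            "window\\.location(?:\\.href)?\\s*=\\s*([\"'])(.*?)\\1");

    /** Returns the absolute URL the script redirects to, if one is found. */
    public static Optional<String> findRedirect(String html, String baseUrl) {
        Matcher m = JS_REDIRECT.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        // Resolve the (possibly relative) target against the page's own URL.
        return Optional.of(URI.create(baseUrl).resolve(m.group(2)).toString());
    }

    public static void main(String[] args) {
        String base = "http://www.miamidade.gov/transit/mobile/routes.asp";
        // Stand-in for the HTML of the first page shown above.
        String firstPage = "<script>window.location="
                + "\"scriptCheck.asp?script=yes&CurrentPage=/transit/mobile/routes.asp?\";"
                + "</script>";
        System.out.println(findRedirect(firstPage, base).orElse("no redirect found"));
    }
}
```

In the original program, `doc.html()` could be passed in as `html`, and the returned URL could be requested with a second `Jsoup.connect(...)` call. Whether the server also expects the cookies from the first response to be sent along (for example via `Connection.Response.cookies()` and `Connection.cookies(...)`) is an assumption that would need to be checked against the site.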


2011-07-11 10:27:20 aldrin

Question: how did you know the page was being redirected? – skinnycat

It's in the page's source code. – aldrin

How come I don't see the same HTML code that you see... I'm using Chrome. – skinnycat
