Executing JavaScript in Java - opening a URL and getting its links

import javax.script.ScriptEngine; 
import javax.script.ScriptEngineManager; 
import java.io.FileReader; 

public class Main { 

    public static void main(String[] args) { 

     ScriptEngineManager manager = new ScriptEngineManager(); 
     ScriptEngine engine = manager.getEngineByName("js"); 
     try { 
      FileReader reader = new FileReader("C:/yourfile.js"); 
      engine.put("urlfromjava", "http://www.something.com/?asvb"); 
      engine.eval(reader); 
      reader.close(); 
     } catch (Exception e) { 
      e.printStackTrace(); 
     } 
    } 
}

Right now, yourfile.js contains these lines:

function urlget(url) 
{ 
    print("URL:"+url); 
    var loc = window.open(url); 
    var link = document.getElementsByTagName('a')["61"].href; 
    return ("\nLink is: \n"+link); 

} 
var x = urlget(urlfromjava); 
print(x);

The error I get:

"javax.script.ScriptException: sun.org.mozilla.javascript.internal.EcmaError: ReferenceError: "window" is not defined"

How can I open a URL and get its links from Java?

2011-05-22 harihb

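According to the documentation:

The window object represents an open window in a browser.

Since you are not executing the script in a browser, the window object is not defined.

You can use the URL/URLConnection classes to read the URL and feed its contents to the ScriptEngine. There is a tutorial here.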
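As a rough illustration of that suggestion, here is a minimal sketch that reads a URL into a String with URLConnection. The class name PageFetcher, the method name fetch, and the UTF-8 assumption are mine, not part of the answer. Note that this returns the markup as the server sent it; no script in the page is executed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, named here only for illustration.
public class PageFetcher {

    // Reads everything the URL returns into one String.
    // Assumption: the response is UTF-8 text; a real page may declare another charset.
    public static String fetch(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        StringBuilder page = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java PageFetcher <url>");
            return;
        }
        String html = fetch(new URL(args[0]));
        System.out.println(html.length() + " characters read");
    }
}
```

The resulting String could then be handed to the script with engine.put, the same way the question passes urlfromjava, for example engine.put("pagefromjava", html), where pagefromjava is a made-up variable name.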

2011-05-22 09:51:24 iruediger

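Wow! Looks like I answered the same thing :) – Tapos 2011-05-22 09:54:21

"Great minds think alike" – iruediger 2011-05-22 11:40:44

I like the answer, except that w3schools is about as much "documentation" as Wikipedia or a random web search result. So the first two lines of this answer are incorrect. – Kit10 2014-01-09 15:22:53

In JavaScript, window represents the browser window. So when you try to execute this JS from Java, it cannot find a browser window and raises the error. You can use the URL class in Java to get the contents of the URL.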

2011-05-22 09:53:16 Tapos

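Actually, the content at the URL has hyperlinks that I can only get by using document.getElementsByTagName('a'). So I need to load the URL in memory, do that, and get the links. – harihb 2011-05-22 10:11:06

You can parse the string using a regular expression pattern. – Tapos 2011-05-22 10:36:48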
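A minimal sketch of the regular-expression approach, assuming the page has already been read into a String. The class name HrefExtractor, the method name extractLinks, and the pattern itself are illustrative choices, not something given in the thread. A regex like this is fragile on real-world HTML, and it only sees links present in the fetched markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
public class HrefExtractor {

    // Matches href="..." or href='...' and captures the quoted value in group 2.
    private static final Pattern HREF = Pattern.compile(
            "href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Returns the href values in the order they appear in the markup.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        String sample = "<p><a href=\"http://example.com/a\">A</a> <a HREF='/b'>B</a></p>";
        for (String link : extractLinks(sample)) {
            System.out.println(link);
        }
    }
}
```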

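The links are not in the page's source code. They are loaded by JavaScript executed on the server side. – harihb 2011-05-23 08:22:28

You can embed Env.js in Rhino to get this kind of functionality.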

2011-05-22 10:04:06 Grooveek

They seem to have stopped working on it 5 years ago – 2016-04-11 14:58:25

Try this:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        URL yahoo = new URL("http://www.yahoo.com/");
        URLConnection yc = yahoo.openConnection();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(yc.getInputStream()))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                // Or collect the page instead: sb.append(inputLine), then pass
                // sb.toString() to a method that extracts links (see getLinks below).
                System.out.println(inputLine);
            }
        }
    }
}



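// Requires Apache Commons Lang. The package below is an assumption, since the
// original omits its imports; older releases use org.apache.commons.lang.
import java.util.HashSet;
import java.util.Set;
import org.apache.commons.lang3.StringUtils;

// Place these members inside a class such as URLConnectionReader.
private static final String CLOSING_QUOTE = "\"";
private static final String HREF_PREFIX = "href=\"";
private static final String HTTP_PREFIX = "http://";

// Returns every absolute http:// link that appears as href="..." in the page.
public static Set<String> getLinks(String page) {
    Set<String> links = new HashSet<String>();
    // Every piece after the first begins right after an href=" prefix.
    String[] rawLinks = StringUtils.splitByWholeSeparator(page, HREF_PREFIX);
    for (String str : rawLinks) {
        if (str.startsWith(HTTP_PREFIX)) {
            links.add(StringUtils.substringBefore(str, CLOSING_QUOTE));
        }
    }
    return links;
}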

2011-05-23 06:01:29 aviad

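Sorry, I couldn't format the code markup - browser problem... @Apache Fan - would you mind doing your thing again? – aviad 2011-05-23 06:32:01

The problem is that the links in the page are generated by JavaScript, so they only arrive after the URL has loaded; i.e., they are not in the source of the HTML file. That is why, after loading the URL, I use document.getElementsByTagName('a') instead of the URL class in Java to extract the links. – harihb 2011-05-23 08:21:04

URL.openConnection emulates what a client browser does, so you get exactly the same markup as the browser. Try it, I believe you will see that it works. If not, let me know what you got and we can try to troubleshoot further. – aviad 2011-05-23 08:29:11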
