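Extracting a URL from a JavaScript function
I have the following problem. I'm using jSoup to extract an image from a page (I'm trying to download a manga), then move to the next page, download the next image, and so on. Normally I extract the URL of the next page from a button: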
<a href="2.html" class="btn next_page"><span></span>next page</a>
But when a chapter of the manga ends, the button on the page redirects me to the next chapter through JavaScript instead:
<a href="javascript:void(0);" onclick="next_chapter()" class="btn next_page"><span></span>next page</a>
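Is there a way to extract the link to the next page? Someone suggested Selenium to me before, but I tried it several times and failed. Does anyone have any suggestions?
OK, here is my code snippet: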
    // Assumes endManga starts as false; the original condition tested the flag the wrong way round.
    while (!endManga) {
        Document doc = Jsoup.connect(link).get();
        String title = doc.title();
        System.out.println(title);

        // The "next page" button is the last link inside the top bar.
        Element nextButtonDiv = doc.getElementById("top_center_bar");
        Elements nextButton = nextButtonDiv.select("a[href]");
        if (nextButton.isEmpty()) {
            endManga = true;
        } else {
            Element nextLinkElement = nextButton.get(nextButton.size() - 1);
            String nextLink;
            // Here is the problem: at some point, when a chapter ends, there is no
            // link to the next one, only the onclick="next_chapter()" JavaScript function.
            if (nextLinkElement.attr("href").length() < 10) {
                nextLink = nextLinkElement.attr("abs:href");
            } else {
                nextLink = nextLinkElement.attr("href");
            }
            link = nextLink;
        }

        // Download every .jpg image inside the viewer element.
        Element content = doc.getElementById("viewer");
        Elements jpgs = content.select("img[src$=.jpg]");
        BufferedImage image = null;
        if (jpgs.isEmpty()) {
            System.out.println("empty!!");
            counterVolume++;
        } else {
            for (Element imageURL : jpgs) {
                image = ImageIO.read(new URL(imageURL.attr("src")));
                ImageIO.write(image, "jpg", new File("manga/"
                        + counterVolume + "_" + counterPage++ + ".jpg"));
                System.out.println("saved - volume: " + counterVolume
                        + " , page: " + counterPage);
            }
        }
    }
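The `length() < 10` check above is a heuristic for telling a relative link such as `2.html` apart from something longer. A more explicit alternative is to test for the `javascript:` scheme directly and resolve anything else against the current page URL. The following is only a minimal sketch using the JDK's `java.net.URI`; the class name `NextLinkResolver`, the method `resolveHref`, and the `example.com` URL are hypothetical names chosen for illustration, not part of the original code.

```java
import java.net.URI;
import java.util.Optional;

public class NextLinkResolver {

    /**
     * Returns the absolute URL for a navigable href, or an empty Optional when
     * the href is missing, blank, a bare fragment, or a javascript: pseudo-URL.
     */
    public static Optional<String> resolveHref(String pageUrl, String href) {
        if (href == null) {
            return Optional.empty();
        }
        String trimmed = href.trim();
        if (trimmed.isEmpty()
                || trimmed.startsWith("#")
                || trimmed.toLowerCase().startsWith("javascript:")) {
            return Optional.empty();
        }
        try {
            // URI.resolve handles both relative ("2.html") and absolute hrefs.
            return Optional.of(URI.create(pageUrl).resolve(trimmed).toString());
        } catch (IllegalArgumentException e) {
            // The href or page URL is not a syntactically valid URI.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String page = "http://example.com/manga/c001/1.html"; // hypothetical page URL
        System.out.println(resolveHref(page, "2.html"));
        System.out.println(resolveHref(page, "javascript:void(0);"));
    }
}
```

In the loop, an empty result would signal that the button is the chapter-end case and that the next URL has to come from somewhere else, for example from the source of `next_chapter()`.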
And here is my code where I use Selenium:
    WebDriver driver = new HtmlUnitDriver();
    driver.get("link_to_page_with_javascript_function");
    WebElement element = driver.findElement(By.id("top_center_bar"));
    List<WebElement> el = element.findElements(By.tagName("a"));
    System.out.println(element.getTagName());
    for (WebElement e : el) {
        if (e.getText().equals("next page")) {
            // Here I have the button which, when clicked, redirects me to the next chapter.
            // How can I extract the link from this function?
            e.click();
        }
    }
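So you want to find out the URL of the next page *before* clicking the element? – Louis
@Louis I was under the impression that this question is specifically about JavaScript. Sorry if that's wrong; feel free to revert if you think it's appropriate. – mafu
I don't want to click it. I want to get the URL without opening a browser. I want to get the link to the next page with jSoup so I can extract the next image. I don't know if that makes sense; if not, I'll add a code snippet from my application. – Dess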
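One browser-free approach that might work here is to read the source of `next_chapter()` and pull the target URL out with a regular expression. The sketch below makes a loud assumption that is not confirmed anywhere in the question: that the function body assigns a quoted URL to `location` or `location.href`. If the site builds the URL some other way, the pattern would need to change. The class name `NextChapterUrlExtractor`, the method `extractNextChapterUrl`, and the sample script are hypothetical. With jSoup, the inline script text could be collected from `doc.select("script")` via each element's `data()`; if the function lives in an external `.js` file, that file's text would have to be fetched instead.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NextChapterUrlExtractor {

    // Captures the body of "function next_chapter() { ... }".
    // Assumption: the body contains no nested braces.
    private static final Pattern FUNCTION_BODY = Pattern.compile(
            "function\\s+next_chapter\\s*\\(\\s*\\)\\s*\\{([^}]*)\\}");

    // Captures a quoted string assigned to location or location.href.
    // Assumption: the redirect is written as a plain string assignment.
    private static final Pattern LOCATION_ASSIGNMENT = Pattern.compile(
            "location(?:\\.href)?\\s*=\\s*[\"']([^\"']+)[\"']");

    /** Returns the URL assigned inside next_chapter(), if one can be found. */
    public static Optional<String> extractNextChapterUrl(String scriptSource) {
        Matcher body = FUNCTION_BODY.matcher(scriptSource);
        if (!body.find()) {
            return Optional.empty();
        }
        Matcher url = LOCATION_ASSIGNMENT.matcher(body.group(1));
        return url.find() ? Optional.of(url.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical script text, standing in for what the page might contain.
        String script = "var x = 1;\n"
                + "function next_chapter() {\n"
                + "    window.location.href = \"http://example.com/manga/c002/\";\n"
                + "}\n";
        System.out.println(extractNextChapterUrl(script).orElse("not found"));
    }
}
```

If the extracted value is a relative path, it would still need to be resolved against the current page URL before being passed to `Jsoup.connect`.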