Python + Selenium + PhantomJS渲染爲PDF

當PhantomJS與Selenium和Python結合使用時，是否可以使用PhantomJS's呈現PDF功能？（即通過Selenium在Python中模仿page.render('file.pdf')行爲）。Python + Selenium + PhantomJS渲染爲PDF

我意識到這使用GhostDriver和GhostDriver並不真正支持很多的打印方式。

如果另一種選擇是不可能的硒不是我的耳朵。

來源

2013-06-04 Rejected

你看過Pypdf2嗎？ http://www.blog.pythonlibrary.org/tag/python-pdf-series/ – Amit

@Amit：相當廣泛，因爲我一直都在使用它。即使Phaseit自己也說過「PyPDF2沒有HTML知識」。它不會可靠地呈現任何HTML。 – Rejected

@Rejected是否需要在測試期間以確切的狀態進行截圖？或者你只是想加載一個頁面並渲染爲PDF？ –

你可以使用selenium.selenium.capture_screenshot('file.png')但這會給你一個PNG屏幕截圖而不是pdf。似乎沒有辦法將屏幕截圖作爲pdf。

這裏是capture_screenshot文檔：http://selenium.googlecode.com/git/docs/api/py/selenium/selenium.selenium.html?highlight=screenshot#selenium.selenium.selenium.capture_screenshot

來源

2014-03-30 22:29:43 KAtkinson

PDF是一個關鍵因素。由於衆多原因，例如文本搜索，表單，嵌入式媒體等，我無法下載到一個簡單的圖像。 – Rejected

試過pdfkit？它可以從HTML頁面呈現PDF文件。

來源

2014-04-02 10:13:24 moodh

我也研究過它。 PDFKit轉換HTML - > PDF，但沒有其他功能。內容分析，以確定頁面是否包含PDF之前所需的內容悲傷是不可能的。 – Rejected

是的，我有與PDFKit相同的問題，我想要更高級的渲染，使用它與JS框架相當麻煩.. :( – moodh

「內容分析，以確定一個頁面是否包含所需的內容」 - >好吧，你不能自己做內容分析，如果它匹配，那麼你只需發送它與pdfkit呈現。這就是我如何做到這一點。 – Jonathan

@rejected，我知道你提到不想使用的子過程，但是......

你實際上可能能夠利用比你預期的子進程通信等等。理論上，您可以採取Ariya's stdin/stdout example並將其擴展爲相對通用的包裝腳本。它可能會首先接受要加載的頁面，然後在該頁面上偵聽（&執行）您的測試操作。最終，你可能會揭開序幕，.render甚至做出錯誤處理的一般拍攝：

try { 
    // load page & execute stdin commands 
} catch (e) { 
    page.render(page + '-error-state.pdf'); 
}

來源

2014-04-04 18:20:25

執行通過stdin接收的代碼需要通過'eval'完成，並且從我嘗試這樣做的經驗中，它都是inse治癒和不可靠。除非我錯了？ – Rejected

儘管您希望對您的輸入保持謹慎（從可靠性的角度來看），但您可能不必擔心安全問題，因爲（我假設）您擁有該流程。 –

你也可以列出特定的命令等，以便更快地處理意外錯誤。然而，我設想的最佳場景是，您可以將屏幕截圖之前可能發生的測試（或其他邏輯）提取到單獨的.js文件並將其加載到頁面中（http://phantomjs.org/api/phantom/）方法/注入-js.html）。你可以讓Python最大傳遞一個arg來加載特定的文件JS。 –

下面是使用硒和特殊命令GhostDriver 溶液（它應該工作，因爲GhostDriver 1.1.0和1.9 PhantomJS。 6，測試與PhantomJS 1.9.8）：

#!/usr/bin/env python 
# -*- coding: utf-8 -*- 

"""Download a webpage as a PDF.""" 


from selenium import webdriver 


def download(driver, target_path): 
    """Download the currently displayed page to target_path.""" 
    def execute(script, args): 
     driver.execute('executePhantomScript', 
         {'script': script, 'args': args}) 

    # hack while the python interface lags 
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute') 
    # set page format 
    # inside the execution script, webpage is "this" 
    page_format = 'this.paperSize = {format: "A4", orientation: "portrait" };' 
    execute(page_format, []) 

    # render current page 
    render = '''this.render("{}")'''.format(target_path) 
    execute(render, []) 


if __name__ == '__main__': 
    driver = webdriver.PhantomJS('phantomjs') 
    driver.get('http://stackoverflow.com') 
    download(driver, "save_me.pdf")

也看到我的回答對同一問題here。

來源

2015-02-01 23:21:30 MTuner

Python + Selenium + PhantomJS渲染爲PDF

回答

相關問題