Retrieving an https://examle.com URL with a Perl script

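I am trying to save an entire web page on my system as an .html file and then parse that file to find some tags and use them. I am able to save and parse http:// URLs, but not https:// URLs. I am using Perl. The code below is what I use to save HTTP pages and it works fine, but it does not work for HTTPS. Is it possible to parse HTTPS pages?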

use strict;
use warnings;
use LWP::UserAgent;
use LWP::Protocol::https;
use HTTP::Cookies;

sub main
{
    my $ua = LWP::UserAgent->new();

    # Keep cookies between runs.
    my $cookies = HTTP::Cookies->new(
        file     => "cookies.txt",
        autosave => 1,
    );

    $ua->cookie_jar($cookies);

    $ua->agent("Google Chrome/30");

    #$ua->ssl_opts(SSL_ca_file => 'cert.pfx');

    # Note: this sets a proxy for the http scheme only.
    $ua->proxy('http', 'http://proxy.com');
    my $response = $ua->get('http://google.com');

    #$ua->credentials($response, "", "usrname", "password");

    # Stop here on failure instead of saving an error page.
    unless ($response->is_success) {
        die "Error: " . $response->status_line . "\n";
    }

    # Let's save the output.
    my $save = "save.html";

    open(my $fh, '>', $save)
        or die "\nCannot create save file '$save': $!\n";

    # Without this line, we may get a
    # 'wide characters in print' warning.
    binmode($fh, ":utf8");

    print $fh $response->decoded_content;

    close $fh;

    # decoded_content is a character string, so length() counts characters.
    print "Saved ",
        length($response->decoded_content),
        " characters of data to '$save'.\n";
}

main();

Source

2013-10-18 Bharath Keshava

Any errors when you run this one-liner? `perl -MLWP::UserAgent -e '$ua = LWP::UserAgent->new; print $ua->get("https://github.com")->decoded_content();'` – Suic

You need to have https://metacpan.org/module/Crypt::SSLeay for HTTPS links.

It provides SSL support for LWP.

This bit me in the ass on a project of my own.

Source

2013-10-18 07:49:47 lsiebert

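Actually, in newer versions of [libwww-perl](https://metacpan.org/release/libwww-perl), you need to make sure [LWP::Protocol::https](https://metacpan.org/module/LWP::Protocol::https) is installed (installing it will force the SSL modules to be installed as well).

Cool, I didn't know that. – lsiebert

It is always worth checking the documentation for the modules you are using...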

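You are using modules from libwww-perl. That distribution includes a cookbook, and in that cookbook there is a section about HTTPS, which says:

URLs with https scheme are accessed in exactly the same way as with http scheme, provided that an SSL interface module for LWP has been properly installed (see the README.SSL file found in the libwww-perl distribution for more details). If no SSL interface is installed for LWP to use, then you will get "501 Protocol scheme 'https' is not supported" errors when accessing such URLs.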
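
As a rough illustration of the failure the cookbook describes, the sketch below tells that specific error apart from other failures. The helper name `https_support_missing` is made up for this example. The responses are built by hand instead of being fetched, so nothing here touches the network. The status text is the one quoted from the cookbook; the exact wording may differ between LWP versions.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper (not part of LWP): true when a response looks like the
# "no SSL interface installed" failure quoted from the cookbook.
sub https_support_missing {
    my ($response) = @_;
    return 0 if $response->is_success;
    return $response->status_line =~ /Protocol scheme 'https' is not supported/
        ? 1
        : 0;
}

# Hand-built stand-ins for what $ua->get(...) might return; the first mimics
# the case where LWP has no HTTPS support available.
my $failed = HTTP::Response->new(501, "Protocol scheme 'https' is not supported");
my $other  = HTTP::Response->new(404, 'Not Found');

for my $response ($failed, $other) {
    if (https_support_missing($response)) {
        print "HTTPS support is missing: ", $response->status_line, "\n";
    }
    else {
        print "Some other result: ", $response->status_line, "\n";
    }
}
```

In a real script the same check would be applied to the object returned by `$ua->get(...)`, before trying to save or parse the content.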

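The README.SSL file says this:

As of libwww-perl v6.02 you need to install the LWP::Protocol::https module from its own separate distribution to enable support for https://... URLs for LWP::UserAgent.

So you just need to install LWP::Protocol::https.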
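
To check whether that is the missing piece on a given machine, something like the following could be run first. It is only a sketch: `module_available` and `needs_separate_https_module` are names invented here, and the 6.02 threshold is taken from the README.SSL text quoted above. If the module turns out to be missing, it can usually be installed from CPAN with `cpan LWP::Protocol::https` (or `cpanm LWP::Protocol::https` if cpanminus is available).

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: true if the named module can be loaded by this perl.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

# Hypothetical helper: true if this libwww-perl version expects HTTPS support
# to come from a separate distribution (v6.02 and later, per README.SSL).
sub needs_separate_https_module {
    my ($lwp_version) = @_;
    return version->parse($lwp_version) >= version->parse('6.02') ? 1 : 0;
}

if (module_available('LWP')) {
    my $lwp_version = LWP->VERSION;
    print "libwww-perl (LWP) version: $lwp_version\n";
    print "HTTPS support is a separate install for this version\n"
        if needs_separate_https_module($lwp_version);
}
else {
    print "LWP itself is not installed\n";
}

print "LWP::Protocol::https: ",
    (module_available('LWP::Protocol::https') ? 'installed' : 'missing'),
    "\n";
```

The check only loads modules and prints what it finds, so it is safe to run before changing anything on the system.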

Source

2013-10-18 10:04:51
