IM建立一個小的代理爲我自己用LWP :: UserAgent的使用,然後我解析使用HTML :: TreeBuilder作爲經過HTML解析(使用HTML :: TreeBuilder作爲)圖像會損壞
視的HTML我的網頁,我的代碼加載一個小模塊,獲取信息/更改顯示,然後我再次打印我的HTML。
剛開始時,我只是在我的模塊中加載HTML :: TreeBuilder,在那裏修改,然後從那裏打印回HTML,一切都很好。
現在我在調用小模塊的代碼中構建樹(小模塊仍在修改它)並從此處打印HTML,並且所有圖像都有錯誤(來自螢火蟲)圖像已損壞或被截斷,並不加載。
這裏是我使用的簡化代碼。
不工作代碼:
$info{content}=$response->content;
$info{tree} = HTML::TreeBuilder->new_from_content($info{content}) or die $!;
do module.pm #modify the tree
$info{content} = $info{tree}->as_HTML(undef,"\t");
$info{tree}->delete();
return \$info{content};
,並在module.pm
my $elem = $info{tree}->look_down(_tag => "img");
$elem->attr('width', '240');
$elem->attr('height', '60');
和模塊的工作代碼
$info{content}=$response->content;
do module.pm #modify the tree
return \$info{content};
:
use strict;
use warnings;
use HTML::TreeBuilder; # Ensure weak references in use
my $tree = HTML::TreeBuilder->new_from_content($awproxy::process::info{content}) or die $!;
my $elem = $tree->look_down(_tag => "img");
$elem->attr('width', '240');
$elem->attr('height', '60');
$awproxy::process::info{content} = $tree->as_HTML(undef,"\t");
$tree->delete();
1;
任何人都有一個想法,它可能來自哪裏?
而返回的HTML從兩個代碼是完全一樣的
編輯:我的所有代碼使用。
main.pm通過perltranshandler
package awproxy::main;
use strict;
use warnings;
use Apache2::Const qw(:common);
use Apache2::RequestRec;
use Apache2::RequestIO;
use awproxy::process;
my $destdomain="domain.com";
my $desthost="www1.domain.com";
my $wwwdesthost="www.domain.com";
sub handler {
my ($r) = @_;
$r->handler("perl-script");
$r->set_handlers(PerlHandler => \&proxy_handler);
return OK;
}
sub proxy_handler {
my($r) = @_;
$r->status(200);
$r->content_type("text/plain");
my $ourhost="aw.mydomain.fr.cr";
my $wwwourhost="awww.mydomain.fr.cr";
my $result=awproxy::process::process($r);
my $dest;
my $headers_in = $r->headers_in;
my $host=$headers_in->get("Host");
if($host=~/^www\.a/) { # matches $wwwourhost
$dest=$wwwdesthost;
} else {
$dest=$desthost;
}
# filter headers_out as with ProxyPassReverse
my $h=$r->headers_out();
foreach my $k (qw(Content-Location Location URI)) {
my $l=$h->get($k);
if($l && ($l=~s!(http://)$desthost!$1$ourhost! || $l=~s!(http://)$wwwdesthost!$1$wwwourhost!)) {
$h->set($k,$l);
}
}
# cookie reverse modification
for my $k ("Set-Cookie") {
my @l=$h->get($k);
foreach my $cookie (@l) {
if($cookie=~s/$desthost/$ourhost/ || $cookie=~s/$wwwdesthost/$wwwourhost/ || $cookie=~s/domain=$destdomain/host=$ourhost/) {
$h->add($k, $cookie);
}
}
}
if($result) {
$r->print($$result);
}
$_=undef; # clear private data
undef %awproxy::process::info;
return OK;
}
1;
由先前的功能
package awproxy::process;
use strict;
use warnings;
use Apache2::Connection; #permet de recup l'ip
use LWP::UserAgent; #pour les connexion
use APR::Table;
use DBI;
use HTML::TreeBuilder;
sub process {
my $desthost="www1.domain.com";
my $wwwdesthost="www.domain.com";
my $ourhost="aw.mydomain.fr.cr";
my $wwwourhost="awww.mydomain.fr.cr";
my $destdomain="aw.mydomain.fr.cr";
my $dir="/usr/lib/perl5/awproxy/";
our %info;
my $r = shift;
our $dbh=DBI->connect('DBI:mysql:XXXX', 'XXXX', 'XXXX'
) || die "Could not connect to database: $DBI::errstr";
#pour recup l'ip
my $c=$r->connection();
my $ip=$c->remote_ip();
# autodetect $ourhost value from input headers
my $headers_in = $r->headers_in;
my $host=$headers_in->get("Host");
my $dest;
if($host=~/^www\.a/) { # matches $wwwourhost
$dest=$wwwdesthost;
} else {
$dest=$desthost;
}
#Pour creer la requete
my $ua = LWP::UserAgent->new();
$ua->agent('Mwoua/proxy');
my $method = uc($r->method);
my $request = HTTP::Request->new($method,"http://".$dest.$r->unparsed_uri);
while(my($key,$val) = each %$headers_in) {
next if($key eq "Host"); # do not override host header
next if($key eq "Accept-Encoding");
$request->header($key,$val);
}
#on ajoute l'ip
if(!$request->header("X-Forwarded-For"))
{
$request->header("X-Forwarded-For", $ip);
$request->header("X-Forwarded-Host", $headers_in->{Host});
}
#on ajoute les donnees POST
if($request->header("Content-Length")) {
my $postdata;
$r->read($postdata,$request->header("Content-Length"));
$request->content($postdata);
}
my $response = $ua->request($request);
if(!$response)
{
$r->status(500);
$r->print("sorry: something went wrong on the aw-side of proxy\n");
return;
}
$r->content_type($response->header('Content-type'));
my $headers_out=[];
$response->scan(sub {
if(lc $_[0] ne "connection") {
$r->headers_out->add(@_);
push(@$headers_out, \@_);
}
});
#Ce dont les modules peuvent avoir besoin
$info{setcookie}=$response->header('Set-Cookie');
$info{content}=$response->content;
#On modifie les liens
$info{content}=~s!(http-equiv="refresh"[^>]*url=http://)$desthost!$1$ourhost!i;
$info{content}=~s!(http-equiv="refresh"[^>]*url=http://)$wwwdesthost!$1$wwwourhost!i;
$info{content}=~s!(<a[^>]* href="?http://)$desthost!$1$ourhost!gi;
$info{content}=~s!(<a[^>]* href="?http://)$wwwdesthost!$1$wwwourhost!gi;
$info{content}=~s!(<img[^>]* src="?http://)$desthost!$1$ourhost!gi;
$info{content}=~s!(<form action="?http://)$desthost!$1$ourhost!gi;
#$info{tree} = HTML::TreeBuilder->new_from_content($info{content}) or die $!;
#on regarde ou on est, et on applique les modifs
my $include=$r->uri;
if($info{content} =~ m!<b>Security Measure</b>!)
{
$include=$dir."security.pm";
}
else
{
$include =~ s/\.php$//i ;
$include =~ s/\/$// ;
$include=$dir.$dest.$include.'.pm';
}
#$info{content}=$include.$info{content};
if(-e $include)
{
require $include; #same with do $include;
}
#$info{content} = $info{tree}->as_HTML(undef," ");
$dbh->disconnect();
undef $dbh;
#$info{tree}->delete();
#undef $info{tree};
return \$info{content};
}
1;
,我做我的測試與此刻的模塊調用的過程稱爲:
use strict;
use warnings;
use HTML::TreeBuilder; # Ensure weak references in use
my $tree = HTML::TreeBuilder->new_from_content($awproxy::process::info{content}) or die $!;
my $elem = $tree->look_down(_tag => "img");
$elem->attr('width', '240');
$elem->attr('height', '60');
$awproxy::process::info{content} = $tree->as_HTML(undef,"\t");
$tree->delete();
注意:這是工作版本,您可以使用info ip輕鬆將其更改爲非工作版本(在process.pm中創建樹並使用此創建的樹或在其他模塊中進行修改)
那麼,調用的模塊依賴於被代理的頁面,可以導出器允許嗎? (即根據參數執行一段代碼)。 我有幾個功能,我代理的每一頁,所以我不想創建一個只有功能的文件,幷包括他們。 你有什麼想法爲什麼圖像的GET不起作用(CSS得到罰款)? – 2012-07-09 22:47:06
@David Levy:沒有看到你的代碼,我真的無法幫助。在它的表面上,我不明白爲什麼你不能把你當前模塊中的所有東西放到一個子例程中並導出它。同樣,我不能說出爲什麼你的代理沒有看到你的代碼沒有正確傳遞圖像。它是二進制數據,而HTML和CSS可能是普通的ASCII碼 - 你處理二進制數據嗎?我建議你發佈你的代碼,以便我們可以幫助更好。 – Borodin 2012-07-10 13:17:23
如果您有時間查看,所有代碼都在我的第一篇文章中。 – 2012-07-26 09:06:04