I am very new to programming; so far I have mostly just been reading documentation. For my little projects I have read a couple of Perl books and the PHP Cookbook. But I picked out some of the recipes and, believe it or not, they look scary on the screen. I think I need some help now: tiny, runnable WWW::Mechanize examples for the beginner.
With my little knowledge it is hard to get things working... I need some Mechanize recipes that actually work, since some of the examples below are outdated:
see the cpan-site for the mechanize examples
I would love to learn more, with living examples. Do you have more?
I would love to hear from you.
You might want to be a bit more specific about what exactly you are after... For example, here is a simple site crawler that extracts information (HTML comments) from every page it visits:
use strict;
use warnings;
use 5.010;    # for say()
use WWW::Mechanize;

#create the mechanize object with autocheck switched off,
#so we don't get an error when a bad/malformed url is requested
my $mech = WWW::Mechanize->new( autocheck => 0 );

my %comments;
my %links;
my @comment;
my $target = "http://google.com";

#store the first target url as not checked
$links{$target} = 0;

#initiate the search
my $url = get_url();

#start the main loop
while ( $url ne "" ) {
    #get the target url
    $mech->get($url);

    #search the source for any html comments
    my $res = $mech->content;
    @comment = $res =~ /<!--[^>]*-->/g;

    #store comments in the 'comments' hash and print them, if any were found
    $comments{$url} = "@comment" and say "\n$url \n---------------->\n $comments{$url}" if @comment;

    #loop through all the links on the current page (only urls contained in an html anchor)
    foreach my $link ( $mech->links() ) {
        $link = $link->url();

        #exclude some irrelevant stuff, such as javascript functions or external links
        #you might want to add a domain-name check, to ensure relevant links aren't excluded
        if ( $link !~ /^(#|mailto:|(f|ht)tp(s)?:|www\.|javascript:)/ ) {

            #check whether the link has a leading slash so we can build the whole url properly
            $link = $link =~ /^\// ? $target . $link : $target . "/" . $link;

            #store it in our hash of links to be searched, unless it's already present
            $links{$link} = 0 unless exists $links{$link};
        }
    }

    #mark this url as searched and start over
    $links{$url} = 1;
    $url = get_url();
}

sub get_url {
    #loop through the links hash and return the next target url that hasn't been searched yet
    #if all urls have been searched, return an empty string, ending the main loop
    while ( my ( $key, $value ) = each(%links) ) {
        return $key if $value == 0;
    }
    return "";
}
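One caveat with the crawler above: it builds absolute URLs by hand with string concatenation, which falls apart for relative paths like ../foo or for links found on pages below the site root. A minimal sketch of a more robust approach using the URI module (example.com here is just a placeholder target):

use strict;
use warnings;
use URI;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0 );
$mech->get("http://example.com/docs/index.html");

#resolve every link against the page it was found on, whatever form it was written in
#(WWW::Mechanize::Link also offers url_abs(), which does the same resolution for you)
foreach my $link ( $mech->links() ) {
    my $absolute = URI->new_abs( $link->url(), $mech->uri() );
    print $absolute->as_string, "\n";
}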
And this is a script that logs into a website:
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
my $url  = "http://www.test.com";
#pre-set a cookie for the .test.com domain before fetching the page
$mech->cookie_jar->set_cookie( 0, "start", 1, "/", ".test.com" );
$mech->get($url);
#select the login form by name, fill in the credentials and submit
$mech->form_name("frmLogin");
$mech->set_fields( user => 'test', passwrd => 'test' );
$mech->click();
#save the resulting page
$mech->save_content("logged_in.html");
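If you prefer, the form handling above can be collapsed into a single submit_form call; a minimal sketch, reusing the made-up www.test.com URL, form name and credentials from the example:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get("http://www.test.com");

#select the form by name, fill the fields and submit it in one go
$mech->submit_form(
    form_name => "frmLogin",
    fields    => { user => 'test', passwrd => 'test' },
);
$mech->save_content("logged_in.html");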
This is a script that performs Google searches and prints the result links from each results page:
use strict;
use warnings;
use 5.010;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();

#maximum number of results to fetch, taken from the last command-line argument
my $option = $ARGV[$#ARGV];

#you may customize your google search by editing this url (always end it with "q=" though)
my $google = 'http://www.google.co.uk/search?q=';
my @dork   = ( "inurl:dude", "cheese" );

#start the main loop, one iteration for every google search term
for my $i ( 0 .. $#dork ) {
    #reset the result counter for each search term
    my $max = 0;

    #loop until the chosen maximum number of results is reached
    while ( $max <= $option ) {
        $mech->get( $google . $dork[$i] . "&start=" . $max );

        #print all the result links, skipping relative urls and google's own links
        foreach my $link ( $mech->links() ) {
            my $google_url = $link->url;
            if ( $google_url !~ /^\// && $google_url !~ /google/ ) {
                say $google_url;
            }
        }
        $max += 10;
    }
}
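Instead of filtering the link list by hand, Mechanize can do the filtering for you with find_all_links; a minimal sketch for a single hardcoded query (the url_regex is only a rough filter that mirrors the checks in the script above):

use strict;
use warnings;
use 5.010;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get('http://www.google.co.uk/search?q=cheese');

#keep only absolute http(s) links that do not point back at google itself
foreach my $link ( $mech->find_all_links( url_regex => qr{^https?://(?!.*google)} ) ) {
    say $link->url();
}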
It really depends on what you are after, but if you want more examples I would point you to perlmonks.org, where you can find plenty of material to get you going.
Definitely bookmark the mechanize module man page though, it is the ultimate resource...
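And since the question asked for tiny, runnable examples, about the smallest useful Mechanize script is one that fetches a page and prints its title (example.com is a placeholder, put any URL there):

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get("http://example.com/");
print $mech->title(), "\n";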
Hello Cyber-Guard Designing! Your answer rocks!!! This is more than I expected. Really. I will dig into the examples later in the day. Again - many thanks for this superb help. You are the man of the day, of course! Many greetings, Apolloman – zero 2010-11-22 00:29:20
What's wrong with the WWW::Mechanize::Cookbook and WWW::Mechanize::Examples pages that the author provides?
Try asking a specific question about the programming problem you are trying to solve. It is hard to answer a general request for recipes. – 2010-11-22 21:25:06