請參閱method find_all_links
in WWW::Mechanize。無需使用解析器手動打擾。你可能想放鬆正則表達式,這樣你就可以同時獲得大約1000個可能的團隊。
use WWW::Mechanize qw();
my $w = WWW::Mechanize->new;
$w->get('http://www.soccerbase.com/teams/home.sd');
for my $link ($w->find_all_links(url_regex => qr/comp_id=1\b/)) {
# 20 instances of WWW::Mechanize::Link
printf "URL=%s\tTeam=%s\n", $link->url_abs, $link->text
}
URL=http://www.soccerbase.com/tournaments/tournament.sd?comp_id=1 Team=Premier League
URL=http://www.soccerbase.com/teams/team.sd?team_id=142&comp_id=1 Team=Arsenal
URL=http://www.soccerbase.com/teams/team.sd?team_id=154&comp_id=1 Team=Aston Villa
URL=http://www.soccerbase.com/teams/team.sd?team_id=308&comp_id=1 Team=Blackburn
URL=http://www.soccerbase.com/teams/team.sd?team_id=354&comp_id=1 Team=Bolton
URL=http://www.soccerbase.com/teams/team.sd?team_id=536&comp_id=1 Team=Chelsea
URL=http://www.soccerbase.com/teams/team.sd?team_id=942&comp_id=1 Team=Everton
URL=http://www.soccerbase.com/teams/team.sd?team_id=1055&comp_id=1 Team=Fulham
URL=http://www.soccerbase.com/teams/team.sd?team_id=1563&comp_id=1 Team=Liverpool
URL=http://www.soccerbase.com/teams/team.sd?team_id=1718&comp_id=1 Team=Man City
URL=http://www.soccerbase.com/teams/team.sd?team_id=1724&comp_id=1 Team=Man Utd
URL=http://www.soccerbase.com/teams/team.sd?team_id=1823&comp_id=1 Team=Newcastle
URL=http://www.soccerbase.com/teams/team.sd?team_id=1855&comp_id=1 Team=Norwich
URL=http://www.soccerbase.com/teams/team.sd?team_id=2093&comp_id=1 Team=QPR
URL=http://www.soccerbase.com/teams/team.sd?team_id=2477&comp_id=1 Team=Stoke
URL=http://www.soccerbase.com/teams/team.sd?team_id=2493&comp_id=1 Team=Sunderland
URL=http://www.soccerbase.com/teams/team.sd?team_id=2513&comp_id=1 Team=Swansea
URL=http://www.soccerbase.com/teams/team.sd?team_id=2590&comp_id=1 Team=Tottenham
URL=http://www.soccerbase.com/teams/team.sd?team_id=2744&comp_id=1 Team=West Brom
URL=http://www.soccerbase.com/teams/team.sd?team_id=2783&comp_id=1 Team=Wigan
URL=http://www.soccerbase.com/teams/team.sd?team_id=2848&comp_id=1 Team=Wolves
再次你已經解決了我的問題,太感謝你了! – blacky 2012-02-22 01:02:25
好吧,我遇到了一個小問題,當我嘗試運行它時,出現此錯誤: 無法通過軟件包「HTTP :: Headers」在/ usr/local找到對象方法「find_all_links」 /ActivePerl-5.14/lib/HTTP/Message.pm 649行。 任何想法爲什麼? – blacky 2012-02-22 13:18:02
所有我需要從這是字符串數組中的url,似乎無法得到它的工作,也得到此錯誤:無法找到對象方法「url_abs」通過包「WWW :: Mechanize」在solution.pl線11. – blacky 2012-02-22 14:31:55