2010-06-25 19 views
2

I'm making progress, but I've run into a new problem: a bareword error in a Perl tutorial script.

This is the new code:

#!/usr/bin/perl -w 
use strict; 
use LWP::Simple; 
use HTML::TreeBuilder; 

my $url = 'http://oreilly.com/store/complete.html'; 
my $page = get($url) or die $!; 
my $p = HTML::TreeBuilder->new_from_content($page); 
my($book); 
my($edition); 

my @links = $p->look_down(
    _tag => 'a', 
    href => qr{^ \Qhttp://www.oreilly.com/catalog/\E \w+ $}x 
); 

my @rows = map { $_->parent->parent } @links; 

my @books; 
for my $row (@rows) { 
    my %book; 
    my @cells = $row->look_down(_tag => 'td'); 
    $book{title} =$cells[0]->as_trimmed-text; 
    $book{price} =$cells[2]->as_trimmed-text; 
    $book{price} =~ s/^\$//; 

    $book{url}  = get_url($cells[0]); 
    $book{ebook} = get_url($cells[3]); 
    $book{safari} = get_url($cells[4]); 
    $book{examples} = get_url($cells[5]); 
    push @books, \%book; 
} 

sub get_url { 
    my $node = shift; 
    my @hrefs = $node->look_down(_tag => 'a'); 
    return unless @hrefs; 
    my $url = $hrefs[0]->attr('href'); 
    $url =~ s/\s+$//; 
    return $url; 
} 

$p = $p->delete; #we don't need this anymore. 

{ 
    my $count = 1; 
    my @perlbooks = sort { $a->{price} <=> $b->{price} } 
        grep { $_->{title} =~/perl/i } @books; 
    print $count++, "\t", $_->{price}, "\t", $_->{title} for @perlbooks; 
} 

{ 
    my @perlbooks = grep { $_->{title} =~ /perl/i } @books; 
    my @javabooks = grep { $_->{title} =~ /java/i } @books; 
    my $diff = @javabooks - @perlbooks; 
    print "There are ".@perlbooks." Perl books and ".@javabooks. " Java books. $diff more Java than Perl."; 
} 

for my $book ($books[34]) { 
    my $url = $book->{url}; 
    my $page = get($url); 
    my $tree = HTML::TreeBuilder->new_from_content($page); 
    my ($pubinfo) = $tree->look_down(
            _tag => 'span', 
            class => 'secondary2' 
    ); 
    my $html = $pubinfo->as_HTML; print $html; 
    my ($pages) = $html =~ /(\d+) pages/; 
    my ($edition) = $html =~ /(\d)(?:st|nd|rd|th) Edition/; 
    my ($date) = $html =~ /(\w+ (19|20)\d\d)/; 

    print "\n$pages $edition $date\n"; 

    my ($img_node) = $tree->look_down(
            _tag => 'img', 
            src => qr{^/catalog/covers/}, 
    ); 
    my $img_url = 'http://www.oreilly.com'.$img_node->attr('src'); 
    my $cover = get($img_url); 
    # now save $cover to disk 
} 

Now I get these errors:

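Bareword "text" not allowed while "strict subs" in use at ./SpiderTutorial_19_06.pl line 23.
Bareword "text" not allowed while "strict subs" in use at ./SpiderTutorial_19_06.pl line 24.
Execution of ./SpiderTutorial_19_06.pl aborted due to compilation errors.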

Any help would be greatly appreciated.

+2

'use strict; use warnings;' – Ether 2010-06-26 00:57:52

Answers

4

I don't know the original program, but most likely as_trimmed-text should be as_trimmed_text.

+0

Or as_trimmed->text? – ysth 2010-06-26 01:49:53

+0

@ysth: I thought of that too, but found 'as_trimmed_text' more likely. The documentation for HTML::Element confirms it. – musiKk 2010-06-26 15:12:29

3

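The problem is the method name as_trimmed-text. Hyphens are not allowed in Perl identifiers. You probably meant as_trimmed_text. As written, it parses as $cells[0]->as_trimmed() - text, a subtraction whose right-hand side is the bareword text, which is exactly what "strict subs" rejects.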
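To make the difference concrete, here is a minimal sketch that should run with core Perl only. FakeCell is a hypothetical stand-in for an HTML::Element node (it is not part of the original script or of any library), so no page has to be fetched. The hyphenated call is compiled through a string eval purely so the compile-time error can be captured instead of aborting the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an HTML::Element node (assumption for this
# sketch only); it provides just the one method the example needs.
package FakeCell;
sub new { my ($class, $text) = @_; return bless { text => $text }, $class }
sub as_trimmed_text {
    my $self = shift;
    (my $trimmed = $self->{text}) =~ s/^\s+|\s+$//g;   # strip outer whitespace
    return $trimmed;
}

package main;

my @cells = (
    FakeCell->new('  Learning Perl  '),
    FakeCell->new('n/a'),
    FakeCell->new(' $39.99 '),
);

# The hyphenated form is read as "as_trimmed() minus the bareword text".
# A string eval inherits "use strict", so compilation fails and $@ is set.
my $broken = eval q{ sub { $_[0]->as_trimmed-text } };
my $error  = $@;
print "compile error: $error" unless $broken;

# The underscore form is an ordinary method call.
my %book;
$book{title} = $cells[0]->as_trimmed_text;
$book{price} = $cells[2]->as_trimmed_text;
$book{price} =~ s/^\$//;                               # drop the leading dollar sign
print "$book{title} costs $book{price}\n";
```

In the real script the same two lines would call as_trimmed_text on the cells returned by look_down, with nothing else changed.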
