Can't use string as a HASH ref..?

I want to parse HTML documents for a web indexing program. To do this, I'm using HTML::TokeParser.

I get an error on the last line of my first if statement:

if ($token->[1] eq 'a') { 
    #href attribute of tag A 
    my $suffix = $token->[2]{href};

The error says: Can't use string ("</a>") as a HASH ref while "strict refs" in use at ./indexer.pl line 270, <PAGE_DIR> line 1.

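Is my problem that something ($suffix or </a>) is a string and needs to be turned into a hash reference? I've looked at other posts with similar errors... but I'm still clueless about this. Thanks for any help.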

sub parse_document { 

    #passed from input 
    my $html_filename = $_[0]; 

    #base url for links 
    my $base_url = $_[1]; 

    #created to hold tokens 
    my @tokens =(); 

    #created for doc links 
    my @links =(); 

    #creates parser 
    my $p = HTML::TokeParser->new($html_filename); 

    #loops through doc tags 
    while (my $token = $p->get_token()) { 
     #code for retrieving links 
     if ($token->[1] eq 'a') { 
      # href attribute of tag A 
      my $suffix = $token->[2]{href}; 

      #if href exists & isn't an email link 
      if (defined($suffix) && !($suffix =~ "^mailto:")) { 
       #make the url absolute 
       my $new_url = make_absolute_url $base_url, $suffix; 

       #make sure it's of the http:// scheme 
       if ($new_url =~ "^http://"){ 
        #normalize the url 
        my $new_normalized_url = normalize_url $new_url; 

        #add it to links array 
        push(@links, $new_normalized_url); 
       } 
      } 
     } 

     #code for text words 
     if ($token->[0] eq 'T') { 
      my $text = $token->[1]; 

      #add words to end of array 
      #(split by non-letter chars) 
      my @words = split(/\P{L}+/, $text); 
     } 
    } 

    return (\@tokens, \@links); 
}

2011-10-31 mdegges

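I would print out some debug statements to see exactly what it thinks the token is, using Data::Dumper (print Dumper($token)), and also check what $token->[1] is. It could be a ' or something similar messing up the values. – scrappedcola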
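The debugging step suggested in this comment could look roughly like the sketch below. The sample HTML string and the @seen array are assumptions made for illustration; the original script reads from a file instead. Passing a scalar reference to HTML::TokeParser->new makes it parse the string directly.

```perl
use strict;
use warnings;
use Data::Dumper;
use HTML::TokeParser;

# Hypothetical sample document, not taken from the original post.
my $html = '<p>See <a href="page.html">this page</a>.</p>';

# A scalar reference tells the parser to read from the string
# rather than treating the argument as a file name.
my $p = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

# Terse output prints the bare structure without a "$VAR1 =" prefix.
$Data::Dumper::Terse  = 1;
$Data::Dumper::Indent = 1;

my @seen;    # keep every token so it can be inspected afterwards
while (my $token = $p->get_token()) {
    push @seen, $token;
    print Dumper($token);
}
```

In the dump, the start tag for a shows a hash in slot 2, while the end tag shows only the literal tag text there, which is the string named in the error message.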

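The get_token() method returns an array reference in which $token->[2] is a hash reference containing your href only when $token->[0] is S (that is, a start tag). In this case you are matching an end tag (where $token->[0] is E), and for end tags $token->[2] is the tag text itself. See the perldoc for HTML::TokeParser for details.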

To fix it, add

next if $token->[0] ne 'S';

at the top of your loop.
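One thing to watch: a next at the very top of the loop skips every token that is not a start tag, so the later $token->[0] eq 'T' branch in the question's code would no longer run. A possible variant, sketched here under the assumption that both links and text words are wanted, checks the token type inside each condition instead. The sample HTML and the @hrefs and @words arrays are made up for illustration.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample document, not taken from the original post.
my $html = '<a href="a.html">first</a> and <a href="mailto:x@example.com">mail</a>';

my $p = HTML::TokeParser->new(\$html)
    or die "Cannot create parser";

my (@hrefs, @words);
while (my $token = $p->get_token()) {
    my $type = $token->[0];

    # Only start tags ('S') carry an attribute hash in $token->[2].
    if ($type eq 'S' && $token->[1] eq 'a') {
        my $href = $token->[2]{href};
        push @hrefs, $href
            if defined $href && $href !~ /^mailto:/;
    }
    # Text tokens ('T') keep their text in $token->[1].
    elsif ($type eq 'T') {
        # Split on runs of non-letters and drop empty leading fields.
        push @words, grep { length } split /\P{L}+/, $token->[1];
    }
}

print "links: @hrefs\n";
print "words: @words\n";
```

End tags fall through both branches untouched, so $token->[2] is never dereferenced when it holds a plain string.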

2011-10-31 19:37:06

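Thanks! I thought I could skip the check for the start tag because I didn't really understand what it was for... but I guess it does need to be used here. – mdegges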

Apparently $token->[2] is being resolved as a hash reference whose value is "</a>". Surely not what you want!

2011-10-31 19:34:13 ennuikiller

Actually, $token->[2] is a string ("</a>"), and he is trying to *use* it as a hash reference. –

@Brian Yes, thanks for the correction! – ennuikiller

$token->[2] is a string, not a hash reference.

Do a print $token->[2] and you will see that it contains the string </a>.
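Besides printing the value, ref() can tell a hash reference apart from a plain string before anything is dereferenced. The sketch below uses hand-built tokens in the shapes HTML::TokeParser documents for start and end tags. The token values and the href_of helper are hypothetical and only serve to illustrate the check.

```perl
use strict;
use warnings;

# Hand-built tokens mimicking the documented layouts:
#   start tag: ["S", $tag, $attr, $attrseq, $text]
#   end tag:   ["E", $tag, $text]
my $start = ['S', 'a', { href => 'page.html' }, ['href'], '<a href="page.html">'];
my $end   = ['E', 'a', '</a>'];

# Hypothetical helper: return the href only if slot 2 really is a hash ref.
sub href_of {
    my ($token) = @_;
    return undef unless ref($token->[2]) eq 'HASH';
    return $token->[2]{href};
}

for my $token ($start, $end) {
    # ref() returns an empty string for anything that is not a reference.
    my $kind = ref($token->[2]) || 'plain string';
    my $href = href_of($token);
    print "$token->[0] token: slot 2 is $kind, href is ",
        (defined $href ? $href : 'undefined'), "\n";
}
```

With strict refs enabled, writing $end->[2]{href} directly would die with the same "Can't use string" error as in the question, whereas the guarded helper simply returns undef.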

2011-10-31 19:39:06
