2016-12-14 47 views
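I'm using $ua to fetch some HTML from my $url = "http://finance.yahoo.com/quote/MSFT?p=MSFT" and trying to parse it with Mojo::DOM, but I'm not getting the tags right

I can fetch the HTML content from the URL. I then use Mojo::DOM to parse a sub-section of it; that is the right step, isn't it? I also want to pull the a href values out of the HTML content that get() returns for my $url. This is what I have: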

my $ua = Mojo::UserAgent->new(max_redirects => 5, timeout => $timeout); 
my $dom = Mojo::DOM->new; 

my $content = $ua->get($url)->res->dom->at('div#quoteNewsStream-0-Stream')->content; 
my $content2 = $content->$dom->find('a href#'); 

Answer

Just use the Mojo::DOM object that Mojo::UserAgent already returns; there is no need to create a separate one with Mojo::DOM->new.

#!/usr/bin/env perl 

use strict; 
use warnings; 
use v5.10; 

use Mojo::UserAgent; 

my $url = "http://finance.yahoo.com/quote/MSFT?p=MSFT"; 

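# get() performs the request; res->dom parses the response body into a Mojo::DOM object
my $dom = Mojo::UserAgent->new->get($url)->res->dom; 

# at() returns the first matching element, or undef if the page has none
my $stream = $dom->at('div#quoteNewsStream-0-Stream') 
    or die "news stream container not found\n"; 

# find() returns a Mojo::Collection; an element's attributes are available via hash access
for my $href ($stream->find('a')->each) { 
    say $href->{href}; 
} 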

Output:

/news/jeff-bezos-trump-tech-summit-was-very-productive-224326329.html 
/news/jeff-bezos-trump-tech-summit-was-very-productive-224326329.html 
/news/donald-trump-tech-summit-at-trump-tower-202517070.html 
/video/microsoft-surface-sales-surge-disappointment-181934121.html 
/news/jeff-bezos-trump-tech-summit-was-very-productive-224326329.html 
/news/microsoft-surface-sales-surge-on-disappointment-with-macbook-pro-163819168.html 
/news/microsoft-surface-sales-surge-on-disappointment-with-macbook-pro-163819168.html 
/m/7f581deb-0089-341a-b637-e1e979e9e210/ss_5-point-checklist-for.html 
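
The output above contains duplicates and site-relative paths. If you want unique, absolute URLs, something along these lines might work. It is only a sketch: the inline HTML is a made-up stand-in for the fetched page so the example runs offline, and the base URL is assumed to be the page URL from the question.

```perl
#!/usr/bin/env perl

use strict;
use warnings;
use v5.10;

use Mojo::DOM;
use Mojo::URL;

# Hypothetical stand-in for the fetched page (not real Yahoo markup).
my $html = <<'HTML';
<div id="quoteNewsStream-0-Stream">
  <a href="/news/first-story.html">First</a>
  <a href="/news/first-story.html">First (image link)</a>
  <a href="/video/second-story.html">Second</a>
  <a id="no-href">No link target</a>
</div>
HTML

# Assumption: relative links are resolved against the page URL.
my $base = Mojo::URL->new('http://finance.yahoo.com/quote/MSFT?p=MSFT');

my $stream = Mojo::DOM->new($html)->at('div#quoteNewsStream-0-Stream')
    or die "news stream container not found\n";

# 'a[href]' matches only anchors that carry an href attribute.
my @links = $stream->find('a[href]')
    ->map(attr => 'href')    # Mojo::DOM elements -> href strings
    ->uniq                   # drop repeated links
    ->map(sub { Mojo::URL->new($_)->to_abs($base)->to_string })
    ->each;

say for @links;
```

Note that the selector in the question, 'a href#', is not valid for this purpose: attributes are matched with square brackets, as in 'a[href]'.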

For an 8-minute tutorial on using these tools, see Mojocast Episode 5.

Thanks for the suggestion to check out Mojocast. Thank you very much.