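I am trying to extract the following from the EnviroCanada weather page. How do I parse the web page?
I want to get the following for each hour:
Time | Thigh | Tlow | Humidity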
7:00 | 23 | 22.9 | 30
Extract of the HTML page:
<tr>
<td headers="header1" class="text-center vertical-center"> 7:00 </td>
<td headers="header2" class="media vertical-center"><span class="pull-left"><img class="media-object" height="35" width="35" src="/weathericons/small/02.png" /></span><div class="visible-xs visible-sm">
<br />
<br />
</div>
<div class="media-body">
<p>Partly Cloudy</p>
</div>
</td>
<td headers="header3m" class=" metricData text-center vertical-center">23
(22.9)
</td>
<td headers="header3i" class=" imperialData hidden text-center vertical-center">73
(73.2)
</td>
<td headers="header4m" class="metricData text-center vertical-center">
<abbr title="West-Northwest">WNW</abbr> 8</td>
<td headers="header4i" class="imperialData hidden text-center vertical-center">
<abbr title="West-Northwest">WNW</abbr> 5</td>
<td headers="header6" class="metricData text-center vertical-center">30</td>
<td headers="header6" class="imperialData hidden text-center vertical-center">87</td>
<td headers="header7" class="text-center vertical-center">83</td>
<td headers="header8" class="metricData text-center vertical-center">20</td>
<td headers="header8" class="imperialData hidden text-center vertical-center">68</td>
<td headers="header9m" class="metricData text-center vertical-center">100.7</td>
<td headers="header9i" class="imperialData hidden text-center vertical-center">29.7</td>
<td headers="header10" class="metricData text-center vertical-center">24</td>
<td headers="header10" class="imperialData hidden text-center vertical-center">15</td>
</tr>
Code so far:
use strict;
use warnings;
use LWP::Simple;
use HTML::TokeParser;
my $url = "http://weather.gc.ca/past_conditions/index_e.html?station=yyz";
my $page = get($url) ||
    die "Could not load URL\n";
my $parser = HTML::TokeParser->new(\$page) ||
    die "Parse error\n";
$parser->get_tag("td") foreach();
$parser->get_tag("");
my $time = $parser->get_text();
??
my $thigh = $parser->get_text();
???
my $tlow = $parser->get_text();
???
my $humid = $parser->get_text();
I am completely lost here.
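For what it is worth, here is a minimal sketch of one way the row above could be walked with HTML::TokeParser, not a definitive solution. It assumes every data row looks like the extract: the time in the `header1` cell, the two temperatures as `23 (22.9)` in the `header3m` cell, and the value wanted as humidity (30 in the example) in the metric `header6` cell. The `parse_rows` helper and the `$sample` string are made up for illustration, and the sample is a reduced copy of the row so the script runs without network access.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Reduced copy of the row from the question (assumption: live rows look the same).
my $sample = <<'HTML';
<table>
<tr>
<td headers="header1" class="text-center vertical-center"> 7:00 </td>
<td headers="header3m" class=" metricData text-center vertical-center">23
(22.9)
</td>
<td headers="header3i" class=" imperialData hidden text-center vertical-center">73
(73.2)
</td>
<td headers="header6" class="metricData text-center vertical-center">30</td>
<td headers="header6" class="imperialData hidden text-center vertical-center">87</td>
</tr>
</table>
HTML

# Hypothetical helper: returns one [time, thigh, tlow, humid] array ref per row.
sub parse_rows {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html) or die "Parse error\n";
    my (@rows, %cell);

    # get_tag() with several names stops at whichever of them comes next.
    while (my $tag = $parser->get_tag('td', '/tr')) {
        if ($tag->[0] eq '/tr') {
            # End of a row: keep it only if all three cells were seen
            # and the temperature cell has the "23 (22.9)" shape.
            if (   defined $cell{header1}
                && defined $cell{header3m}
                && defined $cell{header6}
                && $cell{header3m} =~ /(-?\d+(?:\.\d+)?)\D*\((-?\d+(?:\.\d+)?)\)/)
            {
                push @rows, [ $cell{header1}, $1, $2, $cell{header6} ];
            }
            %cell = ();
            next;
        }

        my $attr = $tag->[1];    # attribute hash of the <td> start tag
        next if ($attr->{class} // '') =~ /\bimperialData\b/;    # skip imperial cells
        my $headers = $attr->{headers};
        next unless defined $headers;

        # Text up to the closing </td>, with whitespace collapsed and trimmed.
        $cell{$headers} = $parser->get_trimmed_text('/td');
    }
    return @rows;
}

print join(' | ', @$_), "\n" for parse_rows($sample);
```

On the sample this prints `7:00 | 23 | 22.9 | 30`. To run it against the live page, the string returned by `get($url)` from the code above could be passed to `parse_rows` instead of `$sample`, assuming the page still uses the same `headers` attributes.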
[HTML::TableExtract is very useful](https://www.nu42.com/2012/04/htmltableextract-is-beautiful.html). –
I like Mojo::DOM for getting things out of HTML pages; it is very easy to use. – asjo