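Others have already pointed out that you want the `/s` option so that `.` matches newlines, which lets `.*` cross logical line boundaries. You probably also want the non-greedy `.*?`: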
use v5.10;
my $html = <<'HTML';
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
HTML
my $regex = qr|
    <td.*?>Activation \s+ Date:</td>
    \s*
    <td.*?class="dataEntry".*?>\s*
    (\S+)
    \s*</td>
|xs;
if ($html =~ $regex) {
    say "matched: $1";
}
else {
    say "mismatched!";
}
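To see the two points in isolation, here is a minimal sketch. The two-line `$text` string and the variable names are made up for illustration and are not part of the original question:

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical two-line input, used only for illustration.
my $text = "<b>one</b>\n<b>two</b>";

# Without /s, "." does not match the newline, so the match cannot span lines.
my $without_s = $text =~ m{one.*two}  ? 1 : 0;

# With /s, "." also matches the newline.
my $with_s    = $text =~ m{one.*two}s ? 1 : 0;

# Greedy .* runs to the last "</b>"; non-greedy .*? stops at the first one.
my ($greedy) = $text =~ m{<b>(.*)</b>}s;
my ($lazy)   = $text =~ m{<b>(.*?)</b>}s;

say "without /s matched: $without_s";
say "with /s matched:    $with_s";
say "greedy capture length:     ", length $greedy;
say "non-greedy capture length: ", length $lazy;
```

The greedy capture swallows everything up to the final closing tag, which is why the pattern above uses `.*?` inside the `<td ...>` tags.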
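If you have the complete table, it's easier to use something that already knows how to parse tables. Let a module such as HTML::TableParser handle all the details: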
use v5.10;
my $html = <<'HTML';
<table>
<tr>
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
</tr>
</table>
HTML
use HTML::TableParser;
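sub row {
    my ($tbl_id, $line_no, $data, $udata) = @_;
    # The label cell in the sample HTML includes the trailing colon.
    return unless $data->[0] eq 'Activation Date:';
    say "Date is $data->[1]";
}
# Create the parser object: the first argument is an array ref of
# table requests, the second a hash ref of parser attributes.
my $p = HTML::TableParser->new(
    [ { id => 1, row => \&row } ],
    { Decode => 1, Trim => 1, Chomp => 1 },
);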
$p->parse($html);
There's also HTML::TableExtract:
use v5.10;
my $html = <<'HTML';
<table>
<tr>
<td class="fieldLabel" height="18">Activation Date:</td>
<td class="dataEntry" height="18">
10/27/2011
</td>
</tr>
</table>
HTML
use HTML::TableExtract;
my $p = HTML::TableExtract->new;
$p->parse($html);
my $table_tree = $p->first_table_found;
my $date = $table_tree->cell(0, 1);
$date =~ s/\A\s+|\s+\z//g;
say "Date is $date";
[The pony, he comes...](http://stackoverflow.com/a/1732454/554546) – 2012-03-08 18:39:20
In any case, see [tchrist's response](http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string) – JRFerguson 2012-03-08 19:03:11
@JRFerguson: I think I made a guest appearance there too :-) – 2012-03-08 20:45:48