Just for fun, here's a more interesting way to generate the pattern needed to match a long list of words:
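#!/usr/bin/env perl
use strict;
use warnings;
use Regexp::Assemble;
# Collect every command-line argument into a single assembled pattern.
my $ra = Regexp::Assemble->new;
foreach (@ARGV) {
    $ra->add($_);
}
# Print the combined regular expression.
print $ra->re, "\n";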
Save it as "regexp_assemble.pl", install Perl's Regexp::Assemble module, then run:
perl ./regexp_assemble.pl one two three four five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen twenty thirty forty fifty sixty seventy eighty ninety hundred thousand million ' ' '\.' ',' '?' '!'
You should see this generated:
(?^:(?:[ !,.?]|t(?:h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)|en)|f(?:o(?:ur(?:teen)?|rty)|i(?:ft(?:een|y)|ve))|s(?:even(?:t(?:een|y))?|ix(?:t(?:een|y))?)|e(?:ight(?:een|y)?|leven)|nine(?:t(?:een|y))?|hundred|million|one))
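That's Perl's version of the pattern, and it needs a little tweaking to do what you want: remove the leading ?^: and its surrounding parentheses, add a trailing +, and add the i flag to make it case-insensitive: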
pattern = /(?:[ !,.?]|t(?:h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)|en)|f(?:o(?:ur(?:teen)?|rty)|i(?:ft(?:een|y)|ve))|s(?:even(?:t(?:een|y))?|ix(?:t(?:een|y))?)|e(?:ight(?:een|y)?|leven)|nine(?:t(?:een|y))?|hundred|million|one)+/i
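If you would rather not edit the Perl output by hand, the same three tweaks can be applied in Ruby. This is only a sketch: the variable names (`perl_output`, `inner`, `converted`) are mine, and it assumes the output has exactly the `(?^:...)` shape shown above.

```ruby
# Output of regexp_assemble.pl, pasted in as a plain single-quoted string.
perl_output = '(?^:(?:[ !,.?]|t(?:h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)|en)|f(?:o(?:ur(?:teen)?|rty)|i(?:ft(?:een|y)|ve))|s(?:even(?:t(?:een|y))?|ix(?:t(?:een|y))?)|e(?:ight(?:een|y)?|leven)|nine(?:t(?:een|y))?|hundred|million|one))'

# Drop the leading "(?^:" and its matching final ")".
inner = perl_output.sub(/\A\(\?\^:/, '').sub(/\)\z/, '')

# Append "+" so runs of words and separators match as one unit,
# and compile with the case-insensitive flag.
converted = Regexp.new(inner + '+', Regexp::IGNORECASE)
```

`converted` should behave like the hand-edited `pattern` above.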
Here are some scan results:
'one dollar'.scan(pattern) # => ["one "]
'one million dollars'.scan(pattern) # => ["one million "]
'one million three hundred dollars'.scan(pattern) # => ["one million three hundred "]
'one million, three hundred!'.scan(pattern) # => ["one million, three hundred!"]
'one million, three hundred and one dollars'.scan(pattern) # => ["one million, three hundred ", " one "]
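Unfortunately, Ruby doesn't have an equivalent of Perl's Regexp::Assemble module. It would be very useful for this sort of task, because the regular expression engine in Ruby is very fast.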
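Without such a module, a plain alternation can still be built in pure Ruby with `Regexp.union`. This is a rough sketch, not a replacement for Regexp::Assemble: it does not factor out common prefixes, so the resulting pattern is longer and probably less efficient. The variable names are mine; the word list is the one passed on the command line above.

```ruby
words = %w[
  one two three four five six seven eight nine ten eleven twelve
  thirteen fourteen fifteen sixteen seventeen eighteen nineteen
  twenty thirty forty fifty sixty seventy eighty ninety
  hundred thousand million
]
separators = [' ', '.', ',', '?', '!']

# Alternation is tried left to right, so put longer words first;
# otherwise "six" would win over "sixteen".
# Regexp.union escapes plain strings, so '.' and '?' match literally.
alternation = Regexp.union(words.sort_by { |w| -w.length } + separators)

# Interpolate the source rather than the Regexp object: the object
# would carry its own (case-sensitive) flags into the new pattern.
union_pattern = /(?:#{alternation.source})+/i
```

It can be used with scan in the same way as the assembled pattern.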
The only downside is that it captures leading and trailing whitespace, but that's easily fixed by using map(&:strip) on the results:
'one million, three hundred and one dollars'.scan(pattern).map(&:strip) # => ["one million, three hundred", "one"]
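One more wrinkle: because spaces and punctuation are part of the character class, scan also returns matches made up only of separators when the text contains ordinary words. A small helper can drop those. `number_phrases` is a hypothetical name of mine, and treating "contains at least one letter" as the test for a real match is an assumption about what you want.

```ruby
pattern = /(?:[ !,.?]|t(?:h(?:irt(?:een|y)|ousand|ree)|w(?:e(?:lve|nty)|o)|en)|f(?:o(?:ur(?:teen)?|rty)|i(?:ft(?:een|y)|ve))|s(?:even(?:t(?:een|y))?|ix(?:t(?:een|y))?)|e(?:ight(?:een|y)?|leven)|nine(?:t(?:een|y))?|hundred|million|one)+/i

# Hypothetical helper: scan, strip surrounding whitespace, then keep
# only the matches that contain at least one letter.
def number_phrases(text, pattern)
  text.scan(pattern).map(&:strip).select { |s| s.match?(/[a-z]/i) }
end

number_phrases('Hello, world!', pattern)
# => []
number_phrases('one million, three hundred and one dollars', pattern)
# => ["one million, three hundred", "one"]
```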
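Why don't you have 'and' in the list? – sawa
@sawa Sorry, I don't understand. Where should I put 'and'? –
Either in the list or in the regex. Without it, how do you expect to capture 'two hundred and eighty'? – sawa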