2015-10-16 12 views
0

Perl regular expression pattern matching

Input, from a GMF file:

CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446 

In my Perl code, I use the following to extract the fields from each line:

if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|.*\|(.*?)$/) 
{ 
    $tag=$1; 
    $lineTxt=$2; 
    $usage = $3; 
    $amt = $4; 
} 

Output of the extracted strings:

tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package usage :: 3126 amt :: 
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Smartphone Package usage :: 3126 amt :: 
tag :: CUSTEVSUMMROW_GPRS lineTxt :: GPRS - Nova Subscriber Non-Smartphone Package - Charged usage :: 3126 amt :: 234446 

How can I retrieve and print whether the usage unit is MB or GB? Can anyone help me?

Answers

1

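You can extend what you have with one more capture group for the units column:

if($line=~m/^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|.*\|(.*?)$/) 
{ 
    $tag=$1; 
    $lineTxt=$2; 
    $usage = $3; 
    $units = $4;   # the column right after the usage figure, e.g. GB
    $amt = $5;     # still the last column, as in your original pattern
} 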

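But I would suggest that a regex is not the best way to solve this problem. I would consider using split, and handling your first field separately.

Something like this, perhaps: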

#!/usr/bin/env perl 
use strict; 
use warnings; 

use Data::Dumper; 

my @fields = qw (tag lineTxt usage units amt); 

while (<DATA>) { 
    chomp;  # drop the trailing newline so it does not end up in the last field
    my ($first_field, @record) = split /\|/; 

    #split the first field on _just_ the first space. 
    unshift(@record, $first_field =~ m/^(\w+) (.*)$/); 

    #use a hash slice to put that record into a hash of named keys. 
    my %data; 
    @data{@fields} = @record; 
    print Dumper \%data; 

    # can of course, make this an array of hashes quite easily. 
} 


__DATA__ 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | 
CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446 

This prints each record in this form:

$VAR1 = { 
      'units' => 'GB', 
      'tag' => 'CUSTEVSUMMROW_GPRS', 
      'amt' => '7500000', 
      'usage' => '3126', 
      'lineTxt' => 'GPRS - Nova Subscriber Non-Smartphone Package - Charged' 
     }; 
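
The "array of hashes" idea mentioned in the code comment above could look roughly like the sketch below. It is only an illustration: the lines are held in an in-memory `@lines` array rather than read from `DATA`, and `@lines` and `@records` are names assumed here, not part of the original code. As in the code above, the hash slice fills only the five named keys, so any further column is ignored.

```perl
use strict;
use warnings;

my @fields = qw(tag lineTxt usage units amt);

# Sample lines from the question, kept in memory for this sketch.
my @lines = (
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | ',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Smartphone Package|3126|GB| | ',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446',
);

my @records;    # will hold one hash reference per input line
for my $line (@lines) {
    my ($first_field, @rest) = split /\|/, $line;

    # Split the first field on just the first space; skip lines that do not fit.
    my ($tag, $text) = $first_field =~ m/^(\w+) (.*)$/ or next;

    # Hash slice: name the columns, then store a reference to the hash.
    my %data;
    @data{@fields} = ($tag, $text, @rest);
    push @records, \%data;
}

# Each record can now be addressed by field name.
print "$_->{lineTxt}: $_->{usage} $_->{units}\n" for @records;
```

For the three sample lines, every record ends up with `usage` set to `3126` and `units` set to `GB`.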
+0

@Sobrique Thank you so much. – RAVJI

3

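You are not capturing the column after \d+. Add parentheses to do that.

.* is greedy, i.e. it matches as much as possible. Add ? to make it frugal (non-greedy), so that it stops at the first | that lets the rest of the pattern match.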

if ($line =~ /^(CUSTEVSUMMROW_GPRS|CUSTEVSUMMROW).*?\s(.*?)\|(\d+)\|(.*?)\|/) 
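
To see the difference between the greedy and the frugal form, here is a minimal sketch run against one of the sample lines from the question. The variable names `$greedy` and `$lazy` are illustrative only, and the patterns are cut down to the part after the usage figure.

```perl
use strict;
use warnings;

# One of the sample records from the question.
my $line = 'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446';

# Greedy: (.*) runs on to the last '|' in the line, swallowing two columns.
my ($greedy) = $line =~ /\|\d+\|(.*)\|/;

# Frugal: (.*?) stops at the first '|' after the usage figure.
my ($lazy) = $line =~ /\|\d+\|(.*?)\|/;

print "greedy: $greedy\n";    # greedy: GB|7500000
print "lazy:   $lazy\n";      # lazy:   GB
```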

You can also rewrite the alternation as an optional group:

(CUSTEVSUMMROW(?:_GPRS)?)
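
Putting the pieces together, a sketch of the whole match might look like the following. The helper name `parse_usage_line` is hypothetical, introduced here just to keep the example self-contained; the pattern is the one above with the optional group substituted in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (tag, lineTxt, usage, units) in list context,
# or an empty list if the line does not match.
sub parse_usage_line {
    my ($line) = @_;
    return $line =~ /^(CUSTEVSUMMROW(?:_GPRS)?).*?\s(.*?)\|(\d+)\|(.*?)\|/;
}

# Sample lines from the question.
my @sample = (
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package|3126|GB| | ',
    'CUSTEVSUMMROW_GPRS GPRS - Nova Subscriber Non-Smartphone Package - Charged|3126|GB|7500000|234446',
);

for my $line (@sample) {
    # Skip any line the pattern does not match.
    my ($tag, $lineTxt, $usage, $units) = parse_usage_line($line) or next;
    print "tag :: $tag lineTxt :: $lineTxt usage :: $usage units :: $units\n";
}
```

The `(?:...)` group is non-capturing, so the tag is still `$1` and the numbering of the remaining captures does not change.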