2013-02-20

Validating and grouping an Excel formula format

I want to validate the format of an Excel formula with the following regular expression:

=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\) 

Against this input:

Should pass:

=SUM(A1,A11:A212,A12:A56,A342:A12,A3) 
=SUM(A11:A12,A12:a12,A34:A3) 
=SUM(A1,A2,A3) 
=SUM(A1) 

Should fail:

=SUM(A11:A212:A2,A12:A56,A4,A342:A12) 

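I have the validation part working, but I can't figure out how to capture each comma-separated value as its own group.

How I would like them to be grouped: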

=SUM(A1,A11:A12,A12:A56,A3)  // Groups: $1 = A1 $2 = A11:A12 $3 = A12:A56 $4 = A3 
=SUM(A11:A12,A10:A12,A34:A3) // Groups: $1 = A11:A12 $2 = A10:A12 $3 = A34:A3 
=SUM(A1,A2,A3)               // Groups: $1 = A1 $2 = A2 $3 = A3 
=SUM(A1)                     // Groups: $1 = A1 

How they are currently grouped:

=SUM(A1,A11:A12,A12:A56,A3)  // Groups: $1 = A1 $2 = A3 
=SUM(A11:A12,A10:A12,A34:A3) // Groups: $1 = A11:A12 $2 = A34:A3 
=SUM(A1,A2,A3)               // Groups: $1 = A1 $2 = A3 
=SUM(A1)                     // Groups: $1 = A1 

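Notice that it only captures the first and the last value. I'm quite new to regex, so if I've done something terrible here, please point me in the right direction. Thanks!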

Answers


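This isn't possible with a single pattern: (...)(?:,(...))+ contains 2 groups, so it will always yield 2 captures, no matter how many times the + matches. In most regex engines a capture group inside a repetition keeps only the text from its last iteration, which is why you see just the first and the last value.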
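To see this behavior directly, here is a minimal sketch in Python (not the PHP used in this answer) that runs the pattern from the question unchanged. The helper name captured_groups is made up for illustration, and the sketch assumes Python's re module, which like PCRE keeps only the last iteration of a repeated group.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(
    r"=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)"
)


def captured_groups(formula):
    """Return the tuple of captured groups, or None if the formula does not match.

    Hypothetical helper, named here only for this illustration.
    """
    m = PATTERN.fullmatch(formula)
    return m.groups() if m else None


# Group 2 sits inside the * repetition, so only its last iteration survives.
print(captured_groups("=SUM(A1,A11:A12,A12:A56,A3)"))  # ('A1', ',A3')
print(captured_groups("=SUM(A1)"))                      # ('A1', None)
```

Note that the second capture includes the leading comma, because the comma sits inside the capturing parentheses of the original pattern.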

You need to do this in (at least) two steps:

value  := /\w+\d+(?::\w+\d+)?/ 

value_list := /value(?:,value)*/ 

expression := /=SUM\((value_list)\)/ 

Now match expression, take group 1 (the value_list), and find all occurrences of value within that captured text.

A quick demo in PHP:

$text = 'should pass 

=SUM(A1,A11:A212,A12:A56,A342:A12,A3) 
=SUM(A11:A12,A12:a12,A34:A3) 
=SUM(A1,A2,A3) 
=SUM(A1) 

should fail 

=SUM(A11:A212:A2,A12:A56,A4,A342:A12)'; 

$value  = "\w+\d+(?::\w+\d+)?"; 
$value_list = "$value(?:,$value)*"; 
$expression = "=SUM\(($value_list)\)"; 

preg_match_all("/$expression/", $text, $matches); 

// iterate over $value_list from $expression (group 1) 
foreach($matches[1] as $group1) { 
    preg_match_all("/$value/", $group1, $m); 
    print_r($m); 
} 

This prints:

Array 
(
    [0] => Array 
     (
      [0] => A1 
      [1] => A11:A212 
      [2] => A12:A56 
      [3] => A342:A12 
      [4] => A3 
     ) 

) 
Array 
(
    [0] => Array 
     (
      [0] => A11:A12 
      [1] => A12:a12 
      [2] => A34:A3 
     ) 

) 
Array 
(
    [0] => Array 
     (
      [0] => A1 
      [1] => A2 
      [2] => A3 
     ) 

) 
Array 
(
    [0] => Array 
     (
      [0] => A1 
     ) 

)
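For comparison, the same two-step idea can be sketched in Python. This is only an illustration under a few assumptions: the function name sum_arguments is hypothetical, and the sketch checks one formula at a time with fullmatch instead of scanning a block of text as the PHP demo does.

```python
import re

# Building blocks, mirroring value / value_list / expression above.
VALUE = r"\w+\d+(?::\w+\d+)?"
VALUE_LIST = rf"{VALUE}(?:,{VALUE})*"
EXPRESSION = re.compile(rf"=SUM\(({VALUE_LIST})\)")


def sum_arguments(formula):
    """Return the list of arguments of a =SUM(...) formula, or None if invalid.

    Hypothetical helper: step 1 validates the whole formula and captures the
    argument list; step 2 extracts every value from that captured text.
    """
    m = EXPRESSION.fullmatch(formula)
    if m is None:
        return None
    # VALUE has no capturing groups, so findall returns the full matches.
    return re.findall(VALUE, m.group(1))


print(sum_arguments("=SUM(A1,A11:A212,A12:A56,A342:A12,A3)"))
print(sum_arguments("=SUM(A11:A212:A2,A12:A56,A4,A342:A12)"))  # None
```

Returning None for an invalid formula keeps validation and grouping in one call; whether that suits your code is a design choice, not something the original answer prescribes.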

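I would actually split the string first. Something like:

sub IsFormulaValid 
{ 
    my ($str) = @_; 
    # Capture everything between the parentheses; fail if the wrapper does not match. 
    my ($match) = $str =~ /^=SUM\(([^)]+)\)$/; 
    return 0 unless defined $match; 
    # A limit of -1 keeps trailing empty fields, so "A1," is rejected below. 
    my @sumArgs = split(/,\s*/, $match, -1); 
    foreach my $arg (@sumArgs) { 
        # Each argument must be a single cell (A1) or a range (A1:B2). 
        return 0 if $arg !~ /^[a-z]+\d+(?::[a-z]+\d+)?$/i; 
    } 
    return 1; 
} 

Note that the outer match itself has to be checked as well: without the early return, a string that does not match =SUM(...) at all would leave no arguments to test and would be reported as valid. The Perl input used for testing: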

my @testInput; 

push(@testInput,'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)'); 
push(@testInput,'=SUM(A11:A12,A12:a12,A34:A3)'); 
push(@testInput,'=SUM(A1,A2,A3)'); 
push(@testInput,'=SUM(A1)'); 
push(@testInput,'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)'); 

foreach(@testInput){ 
    print "'$_'\n "; 
    print 'NOT ' if !IsFormulaValid($_); 
    print "VALID\n\n"; 
} 

Result:

'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)' 
    VALID 

'=SUM(A11:A12,A12:a12,A34:A3)' 
    VALID 

'=SUM(A1,A2,A3)' 
    VALID 

'=SUM(A1)' 
    VALID 

'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)' 
    NOT VALID
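The split-first approach also yields the groups the question asks for, not only a valid/invalid answer. A rough Python sketch follows; the name split_sum_arguments is hypothetical, and it assumes the same rules as the Perl above (letters followed by digits, optional whitespace after commas).

```python
import re

# Outer wrapper: =SUM( ... ) with a non-empty argument list.
OUTER = re.compile(r"=SUM\(([^)]+)\)")
# One argument: a cell (A1) or a range (A1:B2), case-insensitive.
ARGUMENT = re.compile(r"[a-z]+\d+(?::[a-z]+\d+)?", re.IGNORECASE)


def split_sum_arguments(formula):
    """Return the arguments of a =SUM(...) formula, or None if it is invalid.

    Hypothetical helper following the split-first idea: strip the wrapper,
    split on commas, then validate each piece on its own.
    """
    outer = OUTER.fullmatch(formula)
    if outer is None:
        return None
    args = re.split(r",\s*", outer.group(1))
    # An empty piece (e.g. from a trailing comma) fails ARGUMENT.fullmatch.
    if all(ARGUMENT.fullmatch(arg) for arg in args):
        return args
    return None


print(split_sum_arguments("=SUM(A1,A11:A12,A12:A56,A3)"))
print(split_sum_arguments("=SUM(A11:A212:A2,A12:A56,A4,A342:A12)"))  # None
```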