2010-11-22 20 views
2

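REGEX: splitting on commas that are not inside single quotes, allowing escaped quotes

I'm looking for a regular expression to use with preg_match_all in PHP 5 that lets me split a string on commas, as long as the comma is not inside single quotes, while still allowing escaped single quotes. Example data would be: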

(some_array, 'some, string goes here','another_string','this string may contain "double quotes" but, it can\'t split, on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000') 

This should produce matches that look like this:

(some_array 

'some, string goes here' 

'another_string' 

'this string may contain "double quotes" but, it can\'t split, on escaped single quotes' 

anonquotedstring 

83448545 

1210597346 + '000' 

1241722133 + '000') 

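I've tried many, many regular expressions. My current one looks like this, although it doesn't match 100% correctly. (It still splits on some commas inside single quotes.)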

"/'(.*?)(?<!(?<!\\\)\\\)'|[^,]+/" 
+1

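This *can* be done, but it is considerably harder than most people think, as it looks like you are discovering. Is there really no library function in PHP that handles this? There is in Perl. If you still haven't gotten a good answer by then, I may try to put the regex together for you. – tchrist 2010-11-22 14:11:50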

Answers

7

Have you tried str_getcsv? It does exactly what you need without a regular expression.

$result = str_getcsv($str, ",", "'"); 

You can even implement this function for PHP versions older than 5.3 by mapping it to fgetcsv, using this snippet from a comment in the documentation:

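if (!function_exists('str_getcsv')) {
    // Fallback for PHP < 5.3: write the string to an in-memory stream and let fgetcsv() parse it.
    // Note: $escape and $eol are accepted for signature compatibility but are not used.
    function str_getcsv($input, $delimiter = ',', $enclosure = '"', $escape = null, $eol = null) {
        $temp = fopen("php://memory", "r+"); // open the stream for reading and writing
        fwrite($temp, $input);
        fseek($temp, 0); // rewind before reading
        $r = fgetcsv($temp, 4096, $delimiter, $enclosure);
        fclose($temp);
        return $r;
    }
}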
+1

This solution works. str_getcsv isn't a valid function for me, since I'm not running PHP 5.3+ – JordanL 2010-11-22 15:12:37

+0

Unfortunately, str_getcsv does not handle commas inside single quotes consistently: http://3v4l.org/Ubk1U – greggles 2015-01-22 20:13:58

+1

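@greggles: I don't know of any interpretation that allows single quotes as a string enclosure. It isn't in [RFC 4180](http://tools.ietf.org/html/rfc4180) either, but according to the documentation PHP lets you set the enclosure to a single quote. – 2015-01-23 09:37:07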

2

As of PHP 5.3, you can save yourself the pain with str_getcsv:

$data=str_getcsv($input, ",", "'"); 

To take your example...

$input=<<<STR 
(some_array, 'some, string goes here','another_string','this string may contain "double quotes" but it can\'t split on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000') 
STR; 

$data=str_getcsv($input, ",", "'"); 
print_r($data); 

This outputs:

Array 
(
    [0] => (some_array 
    [1] => some, string goes here 
    [2] => another_string 
    [3] => this string may contain "double quotes" but it can\'t split on escaped single quotes 
    [4] => anonquotedstring 
    [5] => 83448545 
    [6] => 1210597346 + '000' 
    [7] => 1241722133 + '000') 
) 
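Note that the output above still contains the escaped quote (`\'`) in element 3. If plain text is wanted, a small post-processing pass can tidy the fields. This is only a sketch: `clean_fields` is a hypothetical helper name, not a PHP built-in, and the sample array is copied by hand from the output above rather than re-parsed.

```php
<?php
// Hypothetical helper (not part of PHP): tidy the fields returned by str_getcsv().
function clean_fields(array $fields)
{
    $cleaned = array();
    foreach ($fields as $field) {
        // Turn the escaped quote \' back into a plain single quote, then trim whitespace.
        $cleaned[] = trim(str_replace("\\'", "'", $field));
    }
    return $cleaned;
}

// Two sample fields copied by hand from the print_r output above.
$data = array(
    ' anonquotedstring',
    "this string may contain \"double quotes\" but it can\\'t split on escaped single quotes",
);
$clean = clean_fields($data);
print_r($clean);
```

The same pass could also strip the leading `(` and trailing `)` if they are not part of the data, but that depends on the input format.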
+1

Christ... I feel dumb, lol. I've been coding PHP for eight years and have never used this function. – JordanL 2010-11-22 14:32:39

0

I second the suggestions to use a CSV parser here; that's what they are there for.

If you're stuck with a regex, you could use

preg_match_all(
    '/\s*" # either match " (optional preceding whitespace), 
    (?:\\\\. # followed either by an escaped character 
    |  # or 
    [^"]  # any character except " 
    )*  # any number of times, 
    "\s*  # followed by " (and optional whitespace). 
    |   # Or: do the same thing for single-quoted strings. 
    \s*\'(?:\\\\.|[^\'])*\'\s* 
    |   # Or: 
    [^,]*  # match anything except commas (i.e. any remaining unquoted strings) 
    /x', 
    $subject, $result, PREG_PATTERN_ORDER); 
$result = $result[0]; 

But, as you can see, this is ugly and hard to maintain. Use the right tool for the job.
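If a regex really is unavoidable and only single quotes matter (as in the question), a more compact pattern may be enough. The following is a sketch, not a drop-in replacement for the pattern above: `split_outside_single_quotes` is a hypothetical helper name, and it assumes every quoted string is properly closed. A field is defined as one or more pieces, each either a character that is neither a comma nor a quote, or a complete `'...'` string in which a backslash escapes the next character.

```php
<?php
// Hypothetical helper: split on commas that are outside single-quoted strings.
function split_outside_single_quotes($subject)
{
    // A field is one or more of:
    //   [^,']                  any character except a comma or a single quote
    //   '(?:\\.|[^'\\])*'      a complete quoted string; a backslash escapes the next character
    $pattern = '/(?:[^,\']|\'(?:\\\\.|[^\'\\\\])*\')+/s';
    preg_match_all($pattern, $subject, $matches);

    // Remove the whitespace left over around each field.
    return array_map('trim', $matches[0]);
}

$subject = "(some_array, 'some, string goes here','another_string',"
    . "'this string may contain \"double quotes\" but, it can\\'t split, on escaped single quotes',"
    . " anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000')";

$fields = split_outside_single_quotes($subject);
print_r($fields);
```

Because the outer group uses `+` rather than `*`, the pattern cannot match an empty string, so no empty entries need to be filtered out afterwards. The trade-off is that genuinely empty fields (two adjacent commas) are skipped rather than reported.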

2

With some lookbehind, you can get something close to what you want:

$test = "(some_array, 'some, string goes here','another_string','this string may contain \"double quotes\" but, it can\'t split, on escaped single quotes', anonquotedstring, 83448545, 1210597346 + '000', 1241722133 + '000')"; 
preg_match_all('` 
(?:[^,\']| 
    \'((?<=\\\\)\'|[^\'])*\')* 
`x', $test, $result); 
print_r($result); 

Which gives you this result:

Array 
(
    [0] => Array 
     (
      [0] => (some_array 
      [1] => 
      [2] => 'some, string goes here' 
      [3] => 
      [4] => 'another_string' 
      [5] => 
      [6] => 'this string may contain "double quotes" but, it can\'t split, on escaped single quotes' 
      [7] => 
      [8] => anonquotedstring 
      [9] => 
      [10] => 83448545 
      [11] => 
      [12] => 1210597346 + '000' 
      [13] => 
      [14] => 1241722133 + '000') 
      [15] => 
     ) 

    [1] => Array 
     (
      [0] => 
      [1] => 
      [2] => e 
      [3] => 
      [4] => g 
      [5] => 
      [6] => s 
      [7] => 
      [8] => 
      [9] => 
      [10] => 
      [11] => 
      [12] => 0 
      [13] => 
      [14] => 0 
      [15] => 
     ) 

) 
+0

What is the second array, the one with "e g s 0"? Is your intent that it be thrown away? – greggles 2015-01-22 20:17:21
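On the "e g s 0" array: it looks like a side effect of the capturing parentheses inside the quoted-string part of the pattern. When a capture group is repeated, PCRE keeps only the text from its last iteration, which here is the last character before each closing quote (`here`, `another_string`, `quotes`, `000`). The minimal sketch below shows that behaviour on a toy input (the pattern and the word `some` are illustrative, not taken from the answer), along with the non-capturing form `(?:...)` that avoids the extra entry.

```php
<?php
// A repeated capture group keeps only the text of its last iteration.
preg_match('/(\w)+/', 'some', $capturing);
// $capturing[0] is the whole match, $capturing[1] is the last character the group matched.
print_r($capturing);

// With a non-capturing group there is no second entry at all.
preg_match('/(?:\w)+/', 'some', $nonCapturing);
print_r($nonCapturing);
```

So the second array can safely be ignored, or avoided by writing the inner group as `(?:...)`.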
