2013-08-20 297 views
2

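Extracting substrings from a string containing parentheses, brackets, and hyphens

Regular expressions are not my strong suit, and I'm having some trouble with this case.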

I have the following string:

locale (district - town) [parish] 

I need to extract the following information: 1 - locale, 2 - district, 3 - town

And I have these solutions:

1 - locale

preg_match("/([^(]*)\s/", $input_line, $output_array); 

2 - district

preg_match("/.*\(([^-]*)\s/", $input_line, $output_array); 

3 - town

preg_match("/.*\-\s([^)]*)/", $input_line, $output_array); 

These seem to work well. However, the string can also look like any of these:

localeA(localeB) (district - town) [parish] 
locale (district - townA(townB)) [parish] 
locale (district - townA-townB) [parish] 

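The locale can contain its own parentheses. The town can contain parentheses and/or its own hyphens.

This makes it hard to extract the right information. For the 3 scenarios above, I would have to extract:

localeA(localeB) + district + town

locale + district + townA(townB)

locale + district + townA-townB

I'm finding it hard to handle all of these cases. Can you help me?

Thanks in advance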
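To make the problem concrete, here is a minimal sketch that runs the three patterns above against two of the harder variants. The helper name `extract_parts` and the `$patterns` array are hypothetical and exist only for this illustration; the patterns themselves are copied from the question.

```php
<?php
// The three patterns from the question, keyed by the field they target.
$patterns = [
    'locale'   => '/([^(]*)\s/',
    'district' => '/.*\(([^-]*)\s/',
    'town'     => '/.*\-\s([^)]*)/',
];

// Hypothetical helper: apply each pattern and keep capture group 1
// (or null when the pattern does not match at all).
function extract_parts(string $input, array $patterns): array
{
    $out = [];
    foreach ($patterns as $name => $pattern) {
        $out[$name] = preg_match($pattern, $input, $m) ? $m[1] : null;
    }
    return $out;
}

$a = extract_parts('localeA(localeB) (district - town) [parish]', $patterns);
$b = extract_parts('locale (district - townA(townB)) [parish]', $patterns);

print_r($a); // locale comes back as "localeB)" instead of "localeA(localeB)"
print_r($b); // district is "townB))" and town is "townA(townB"
```

The failures come from the negated character classes `[^(]` and `[^)]`, which stop at the first nested parenthesis, and from the greedy `.*\(`, which jumps to the last `(` in the string.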

+3

[Got speed?](http://regex101.com/r/xS9fZ1) – HamZa

+0

@HamZa: why a comment and not an answer? – anubhava

+0

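@anubhava I was busy with other things, and this was just a quick fiddle. If I post it as an answer, I should at least provide some explanation. – HamZa

Answers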

0

If the locale, district, and town contain no whitespace:

preg_match("/^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\)/", $input_line, $output_array); 

Explanation:

The regular expression: 

(?-imsx:^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\)) 

matches as follows: 

NODE      EXPLANATION 
---------------------------------------------------------------------- 
(?-imsx:     group, but do not capture (case-sensitive) 
         (with ^ and $ matching normally) (with . not 
         matching \n) (matching whitespace and # 
         normally): 
---------------------------------------------------------------------- 
^      the beginning of the string 
---------------------------------------------------------------------- 
    \s*      whitespace (\n, \r, \t, \f, and " ") (0 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    (      group and capture to \1: 
---------------------------------------------------------------------- 
    \S+      non-whitespace (all but \n, \r, \t, \f, 
          and " ") (1 or more times (matching the 
          most amount possible)) 
---------------------------------------------------------------------- 
)      end of \1 
---------------------------------------------------------------------- 
    \s*      whitespace (\n, \r, \t, \f, and " ") (0 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    \(      '(' 
---------------------------------------------------------------------- 
    (      group and capture to \2: 
---------------------------------------------------------------------- 
    \S+      non-whitespace (all but \n, \r, \t, \f, 
          and " ") (1 or more times (matching the 
          most amount possible)) 
---------------------------------------------------------------------- 
)      end of \2 
---------------------------------------------------------------------- 
    \s*      whitespace (\n, \r, \t, \f, and " ") (0 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    -      '-' 
---------------------------------------------------------------------- 
    \s*      whitespace (\n, \r, \t, \f, and " ") (0 or 
          more times (matching the most amount 
          possible)) 
---------------------------------------------------------------------- 
    (      group and capture to \3: 
---------------------------------------------------------------------- 
    \S+      non-whitespace (all but \n, \r, \t, \f, 
          and " ") (1 or more times (matching the 
          most amount possible)) 
---------------------------------------------------------------------- 
)      end of \3 
---------------------------------------------------------------------- 
    \)      ')' 
---------------------------------------------------------------------- 
)      end of grouping 
---------------------------------------------------------------------- 
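
As a rough usage sketch, the pattern above can be run over the four sample strings from the question, plus one input that violates the no-whitespace assumption. The function name `match_no_spaces` and the `New Town` input are made up for this example.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
// Returns [locale, district, town], or null when nothing matches.
function match_no_spaces(string $input): ?array
{
    $pattern = '/^\s*(\S+)\s*\((\S+)\s*-\s*(\S+)\)/';
    if (!preg_match($pattern, $input, $m)) {
        return null;
    }
    return [$m[1], $m[2], $m[3]];
}

$inputs = [
    'locale (district - town) [parish]',
    'localeA(localeB) (district - town) [parish]',
    'locale (district - townA(townB)) [parish]',
    'locale (district - townA-townB) [parish]',
    'New Town (district - town) [parish]', // whitespace inside the locale
];

foreach ($inputs as $input) {
    $parts = match_no_spaces($input);
    echo $parts === null
        ? "no match: $input\n"
        : implode(' | ', $parts) . "\n";
}
```

The greedy `\S+` groups rely on backtracking to give up the final `)`, which is why nested parentheses and inner hyphens still come out intact. The last input does not match, because `\S+` cannot span the space in `New Town`.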

0

I don't know exactly what your rules and edge cases are, but this works for the examples provided:

preg_match('#^(.+?) \((.+?) - (.+?)\) \[(.+)\]$#',$str,$matches); 

It gives these results (when run with each example string in $str):

Array 
(
    [0] => locale (district - town) [parish] 
    [1] => locale 
    [2] => district 
    [3] => town 
    [4] => parish 
) 

Array 
(
    [0] => localeA(localeB) (district - town) [parish] 
    [1] => localeA(localeB) 
    [2] => district 
    [3] => town 
    [4] => parish 
) 

Array 
(
    [0] => locale (district - townA(townB)) [parish] 
    [1] => locale 
    [2] => district 
    [3] => townA(townB) 
    [4] => parish 
) 

Array 
(
    [0] => locale (district - townA-townB) [parish] 
    [1] => locale 
    [2] => district 
    [3] => townA-townB 
    [4] => parish 
)
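
One way to package this pattern is sketched below. It assumes the same fixed separators (`" ("`, `" - "`, `") ["`) as the pattern above. The function name `parse_place`, the named groups, and the `trim()` call are additions for illustration, not part of the original answer. The `trim()` matters because the pattern is anchored with `$`, so a trailing space in the input would otherwise prevent a match.

```php
<?php
// Hypothetical helper: split "locale (district - town) [parish]" into
// named parts, or return null when the input does not have that shape.
function parse_place(string $str): ?array
{
    // Same pattern as above, with named groups added for readability.
    $pattern = '#^(?<locale>.+?) \((?<district>.+?) - (?<town>.+?)\) \[(?<parish>.+)\]$#';

    // trim() removes surrounding whitespace so the ^ and $ anchors can match.
    if (!preg_match($pattern, trim($str), $m)) {
        return null;
    }

    return [
        'locale'   => $m['locale'],
        'district' => $m['district'],
        'town'     => $m['town'],
        'parish'   => $m['parish'],
    ];
}

print_r(parse_place('locale (district - townA(townB)) [parish] '));
var_dump(parse_place('no brackets here')); // NULL
```

Returning null for non-matching input lets the caller tell a malformed line apart from a valid one, instead of reading undefined array offsets.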