2012-04-02 51 views
10

PHP preg_match for Bible scripture format

I am struggling to build a regular expression that parses strings like these (Bible verse references):

'John 14:16–17, 25–26'
'John 14:16–17'
'John 14:16'
'John 14'
'John'

So the basic pattern is:

Book [[Chapter][:Verse]]

where the chapter and the verse are optional.

+0

So it should match even if it is only the book name? Do you have a list of books it should match? Otherwise it will simply match every word. – JJJ 2012-04-02 09:36:41

+0

Matching any word is fine; the real problem is that I have so many optional parts. – Dziamid 2012-04-02 09:41:01

Answers

4

Try this:

\b[a-zA-Z]+(?:\s+\d+)?(?::\d+(?:–\d+)?(?:,\s*\d+(?:–\d+)?)*)? 

See and test it here on Regexr

Because of the (?:,\s*\d+(?:–\d+)?)* at the end, the reference can finish with a list of verses and verse ranges.
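As a rough illustration of how that pattern could be used from PHP, here is a minimal sketch. The `^...$` anchoring, the named groups (`book`, `chapter`, `verses`) and the `u` modifier are additions made for this example and are not part of the answer's pattern.

```php
<?php
// Sketch only: the answer's pattern, anchored and given named groups.
// The group names and the anchoring are illustrative assumptions.
$pattern = '/^(?<book>[a-zA-Z]+)'
         . '(?:\s+(?<chapter>\d+))?'
         . '(?::(?<verses>\d+(?:–\d+)?(?:,\s*\d+(?:–\d+)?)*))?$/u';

$samples = [
    'John 14:16–17, 25–26',
    'John 14:16–17',
    'John 14:16',
    'John 14',
    'John',
];

$parsed = [];
foreach ($samples as $reference) {
    if (preg_match($pattern, $reference, $m)) {
        // Optional groups that did not take part in the match are either
        // missing from $m or empty, so both cases are normalised to null.
        $parsed[$reference] = [
            'book'    => $m['book'],
            'chapter' => ($m['chapter'] ?? '') !== '' ? $m['chapter'] : null,
            'verses'  => ($m['verses'] ?? '') !== '' ? $m['verses'] : null,
        ];
    }
}

print_r($parsed);
```

All five sample strings from the question should parse; for 'John 14' the `verses` entry is null, and for 'John' both `chapter` and `verses` are null.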

+0

Yours is the most generic one. I only added '[-–]' in place of the hyphen, as @Robby suggested, plus some capturing parentheses to make it perfect. – Dziamid 2012-04-02 09:59:52

3

Use this expression:

[A-Za-z]+( ([0-9]+)(:[0-9]+)?([\-–][0-9]+)?(, [0-9]+[\-–][0-9]+)?)?

Or, in its 'pretty' version:

\w+( (\d+)(:\d+)?([\-–]\d+)?(, \d+[\-–]\d+)?)?

Update: changed to match either a dash or a hyphen.


NOTE: I have tested it, and it matches all 5 possible versions.

Example: http://regexr.com?30h4q


9

I think this does what you need:

\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})? 

Assumptions:

  • The numbers always have 1 or 2 digits
  • The dash will match either of the following: - and –
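The first assumption has a visible consequence for references with three-digit numbers, such as Psalm 119:105. The sketch below compares the pattern above with a loosened variant; the `\d{1,3}` variant and the `u` modifier are assumptions added for this example, not part of the answer.

```php
<?php
// The pattern from this answer; the u modifier is added so that the
// en dash inside [-–] is read as a single character.
$twoDigits = '/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u';

// Hypothetical loosened variant: up to three digits per number.
$threeDigits = '/\w+\s?(\d{1,3})?(:\d{1,3})?([-–]\d{1,3})?(,\s\d{1,3}[-–]\d{1,3})?/u';

$subject = 'Psalm 119:105';

preg_match($twoDigits, $subject, $short);
preg_match($threeDigits, $subject, $long);

echo $short[0], PHP_EOL; // Psalm 11
echo $long[0], PHP_EOL;  // Psalm 119:105
```

Because the pattern is not anchored, the two-digit version does not fail on this input; it silently stops after 'Psalm 11', which may be harder to notice than a failed match.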

Here is the regular expression with comments:

" 
\w   # Match a single character that is a "word character" (letters, digits, and underscores)
    +   # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s   # Match a single character that is a "whitespace character" (spaces, tabs, and line breaks)
    ?   # Between zero and one times, as many times as possible, giving back as needed (greedy)
(   # Match the regular expression below and capture its match into backreference number 1
    \d   # Match a single digit 0..9
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy)
(   # Match the regular expression below and capture its match into backreference number 2
    :   # Match the character ":" literally
    \d   # Match a single digit 0..9
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy)
(   # Match the regular expression below and capture its match into backreference number 3
    [-–]  # Match a single character present in the list "-–"
    \d   # Match a single digit 0..9
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy)
(   # Match the regular expression below and capture its match into backreference number 4
    ,   # Match the character "," literally
    \s   # Match a single character that is a "whitespace character" (spaces, tabs, and line breaks)
    \d   # Match a single digit 0..9
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy)
    [-–]  # Match a single character present in the list "-–"
    \d   # Match a single digit 0..9
     {1,2}  # Between one and 2 times, as many times as possible, giving back as needed (greedy)
)?   # Between zero and one times, as many times as possible, giving back as needed (greedy)
" 

Here are some usage examples in PHP:

if (preg_match('/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u', $subject)) {
    # Successful match 
} else { 
    # Match attempt failed 
} 

Get an array of all matches in a given string:

preg_match_all('/\w+\s?(\d{1,2})?(:\d{1,2})?([-–]\d{1,2})?(,\s\d{1,2}[-–]\d{1,2})?/u', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0]; 
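
One PHP detail is worth checking when a pattern contains a non-ASCII character such as the en dash. Without the `u` modifier, PCRE reads the pattern byte by byte, so `[-–]` becomes a class of single bytes and cannot consume the whole multi-byte en dash. A small sketch of the difference (the sample string is chosen for this example):

```php
<?php
$subject = 'John 14:16–17'; // en dash between the verse numbers

// Same pattern fragment, without and with the u (UTF-8) modifier.
$withoutU = preg_match('/:\d+[-–]\d+/', $subject);
$withU    = preg_match('/:\d+[-–]\d+/u', $subject);

var_dump($withoutU); // int(0): the class matches only one byte of the dash
var_dump($withU);    // int(1): the dash is treated as one character
```

With an ASCII hyphen in the subject, both variants match, so the difference only shows up on input that actually uses the en dash.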
+0

So it will match either a dash or a hyphen? – Dziamid 2012-04-02 09:47:11

+0

Yes. Is that correct? – Robbie 2012-04-02 09:48:26

+0

+1 for this, thanks – Dziamid 2012-04-02 10:22:18

0
([1|2|3]?([i|I]+)?(\s?)\w+(\s+?))((\d+)?(,?)(\s?)(\d+))+(:?)((\d+)?([\-–]\d+)?(,(\s?)\d+[\-–]\d+)?)? 

Works for almost every book...

0
(\b[a-zA-Z]\w+\s\d+)(:\d+)+([-–]\d+)?([,;](\s)?(\d+:)?\d+([-–]\d+)?)? 

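This is a mix of all the patterns presented here. The only formats it will not match are "book name only" and "book & chapter only" (just add ":1-all" after the chapter number). I found that the other patterns offered here accept too many variations that do not conform to Bible verse syntax.

These are the examples I tested in RegExr (I cannot post images yet):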

John Humboldt 14:16-17,25-26
John 14: 16-17
John 14:16
John 77:3; 2:9-11
John 5:1-all Brad 555-783-6867
John 6
Hello how are you
Ezra 32:5 John 14 :16-17,25-36
23:34
John 14:16-17,25-36
John 14:16-17; 32:25
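
To see which forms this last pattern accepts, it can be run from PHP over a few inputs. This is only a sketch: the sample strings below were chosen for the example (they are not the list above), and the `u` modifier is an addition so that the en dash in `[-–]` is read as one character.

```php
<?php
// The pattern from this answer, with the u modifier added.
$pattern = '/(\b[a-zA-Z]\w+\s\d+)(:\d+)+([-–]\d+)?([,;](\s)?(\d+:)?\d+([-–]\d+)?)?/u';

// Illustrative inputs, chosen for this sketch.
$samples = [
    'John 14:16-17, 25-26',
    'John 77:3; 2:9-11',
    'John 14:16',
    'John 6',
    'John',
    'hello how are you',
];

$results = [];
foreach ($samples as $text) {
    // Store the matched text, or null when the pattern finds nothing.
    $results[$text] = preg_match($pattern, $text, $m) ? $m[0] : null;
}

var_export($results);
```

The first three inputs should match in full, while 'John 6', 'John' and the plain sentence should give null, which agrees with the stated limitation that a verse is required.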