2017-09-14

C# regex - getting values from a repeatable group

I have this regex pattern, and I am trying to find out whether a sentence (string) matches it.

My pattern:

@"^A\s(?<TERM1>[A-Z][a-z]{1,})\sconsists\sof\s((?<MINIMUM1>(\d+))\sto\s(?<MAXIMUM1>(\d+|many){1})|(?<MINMAX1>(\d+|many{1}){1}){1})\s(?<TERM2>[A-Z][a-z]{1,})(\sand\s((?#********RepeatablePart********)(?<MINIMUM2>(\d+))\sto\s(?<MAXIMUM2>(\d+|many){1})|(?<MINMAX2>(\d+|many{1}){1}){1})\s(?<TERM3>([A-Z][a-z]{1,})))+\.$" 

How to read my pattern:

A (TERM1) consists of (MINIMUM1 to (MAXIMUM1|many)|(MINMAX1|many)) (TERM2) ((?#********RepeatablePart********)and (MINIMUM2 to (MAXIMUM2|many)|(MINMAX2|many)) (TERM3))+. 

MINMAX1/MINMAX2 can be a number or just the word 'many'. MINIMUM1/MINIMUM2 is always a number, and MAXIMUM1/MAXIMUM2 can be a number or the word 'many'.

Example sentences:

  1. A Car consists of 2 to 5 Seats and 1 Breakpedal and 1 Gaspedal and 4 to 6 Windows.
  2. A Tree consists of many Apples and 2 to many Colors and 0 to 1 Squirrel and many Leaves.
  3. A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.

    1. would contain: TERM1 = Car, MINIMUM1 = 2, MAXIMUM1 = 5, MINMAX1 = null, TERM2 = Seats, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Breakpedal, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Gaspedal, MINIMUM2 = 4, MAXIMUM2 = 6, MINMAX2 = null, TERM3 = Windows
    2. would contain: TERM1 = Tree, MINIMUM1 = null, MAXIMUM1 = null, MINMAX1 = many, TERM2 = Apples, MINIMUM2 = 2, MAXIMUM2 = many, MINMAX2 = null, TERM3 = Colors, MINIMUM2 = 0, MAXIMUM2 = 1, MINMAX2 = null, TERM3 = Squirrel, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = many, TERM3 = Leaves
    3. would contain: TERM1 = Book, MINIMUM1 = 1, MAXIMUM1 = many, MINMAX1 = null, TERM2 = Authors, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 1, TERM3 = Title, MINIMUM2 = null, MAXIMUM2 = null, MINMAX2 = 3, TERM3 = Bookmarks

I created a class that I want to fill with the values from the repeatable part of my string (MINIMUM2, MAXIMUM2, MINMAX2 and TERM3):

// MyObject contains the values of one expression from the repeatable part. 
public class MyObject 
{ 
    public string term { get; set; } 
    public string min { get; set; } 
    public string max { get; set; } 
    public string minmax { get; set; } 
} 

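Because my pattern has a repeatable part (+), I want to create a List<MyObject> and add one new MyObject per repetition, filled with the values of that repetition.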

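My problem is that I don't know how to fill my objects with the values from the repeatable part. The way I tried to write the code is wrong because my lists do not have the same number of values: a sentence (e.g. 'A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.') never has a MINIMUM2, a MAXIMUM2 and a MINMAX2 in every repetition.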
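The mismatch can be reproduced in isolation. The following is a minimal sketch, assuming a simplified form of the pattern above (the redundant `{1}` quantifiers and the inline comment are dropped); `CaptureCountDemo` and `Counts` are illustrative names, not part of the original code. It returns how many captures each group of the repeatable part collected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountDemo
{
    // Simplified version of the pattern from the question (an assumption for this sketch).
    private const string Pattern =
        @"^A\s(?<TERM1>[A-Z][a-z]+)\sconsists\sof\s" +
        @"((?<MINIMUM1>\d+)\sto\s(?<MAXIMUM1>\d+|many)|(?<MINMAX1>\d+|many))\s(?<TERM2>[A-Z][a-z]+)" +
        @"(\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+))+\.$";

    // Returns the capture counts in the order TERM3, MINIMUM2, MAXIMUM2, MINMAX2.
    public static int[] Counts(string text)
    {
        Match m = Regex.Match(text, Pattern);
        return new[]
        {
            m.Groups["TERM3"].Captures.Count,
            m.Groups["MINIMUM2"].Captures.Count,
            m.Groups["MAXIMUM2"].Captures.Count,
            m.Groups["MINMAX2"].Captures.Count
        };
    }
}
```

For the Car sentence this should give 3, 1, 1 and 2: three terms, but only one MINIMUM2/MAXIMUM2 pair. Indexing all four lists with the same `i` therefore either runs out of range or pairs a value with the wrong term (the "4" from "4 to 6 Windows" would land on Breakpedal).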

Is there an easier way to fill my objects, or how can I get the values from the quantified part?

My code (in C#):

var match = Regex.Match(exampleText, pattern); 
if (match.Success) 
{ 

    string term1 = match.Groups["TERM1"].Value; 
    string minimum1 = match.Groups["MINIMUM1"].Value; 
    string maximum1 = match.Groups["MAXIMUM1"].Value; 
    string minmax1 = match.Groups["MINMAX1"].Value; 
    string term2 = match.Groups["TERM2"].Value; 

    //--> Groups[].Captures...ToList() might be wrong. Maybe there is a better way to get the values of the repeatable part 
    List<string> minimums2 = match.Groups["MINIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>(); 
    List<string> maximums2 = match.Groups["MAXIMUM2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>(); 
    List<string> minmaxs2 = match.Groups["MINMAX2"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>(); 
    List<string> terms3 = match.Groups["TERM3"].Captures.Cast<Capture>().Select(x => x.Value).ToList<string>(); 

    List<MyObject> myList = new List<MyObject>(); 

    for (int i = 0; i < terms3.Count; i++) 
    { 
        myList.Add(new MyObject() 
        { 
            term = terms3[i], 
            min = minimums2[i],   //--> ERROR: throws ArgumentOutOfRangeException when minimums2 has fewer values than terms3 
            max = maximums2[i],   //--> ERROR: same problem 
            minmax = minmaxs2[i]  //--> ERROR: same problem 
        }); 
    } 
} 
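One way to keep the single pattern is to line the captures up by position instead of by list index. The sketch below is an assumption-laden illustration, not the original code: it wraps each repetition in an extra named group `ITEM` (hypothetical, not in the original pattern), uses a simplified form of the pattern, and introduces the illustrative names `Part`, `RepeatedGroupReader`, `Read` and `Inside`. For every `ITEM` capture it looks for the capture of each inner group whose `Index` falls inside that repetition.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustrative stand-in for MyObject.
public class Part
{
    public string Term { get; set; }
    public string Min { get; set; }
    public string Max { get; set; }
    public string MinMax { get; set; }
}

public static class RepeatedGroupReader
{
    // Simplified pattern; ITEM (an added, hypothetical group) spans one whole repetition.
    private const string Pattern =
        @"^A\s(?<TERM1>[A-Z][a-z]+)\sconsists\sof\s" +
        @"((?<MINIMUM1>\d+)\sto\s(?<MAXIMUM1>\d+|many)|(?<MINMAX1>\d+|many))\s(?<TERM2>[A-Z][a-z]+)" +
        @"(?<ITEM>\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+))+\.$";

    public static List<Part> Read(string text)
    {
        var parts = new List<Part>();
        Match m = Regex.Match(text, Pattern);
        if (!m.Success) return parts;

        // One ITEM capture per repetition, in the order they appear in the text.
        foreach (Capture item in m.Groups["ITEM"].Captures)
        {
            parts.Add(new Part
            {
                Term = Inside(m.Groups["TERM3"], item),
                Min = Inside(m.Groups["MINIMUM2"], item),
                Max = Inside(m.Groups["MAXIMUM2"], item),
                MinMax = Inside(m.Groups["MINMAX2"], item)
            });
        }
        return parts;
    }

    // Returns the value of the capture of 'group' located within 'outer', or null if there is none.
    private static string Inside(Group group, Capture outer)
    {
        foreach (Capture c in group.Captures)
        {
            if (c.Index >= outer.Index && c.Index + c.Length <= outer.Index + outer.Length)
                return c.Value;
        }
        return null;
    }
}
```

Groups that did not take part in a repetition come back as `null`, which mirrors the expected values listed for the example sentences.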

Answer


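I solved my own problem by splitting my exampleText on the word 'and'. This gives me a string array 'splittedText' that contains one phrase for each repetition of the repeatable part of my pattern.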

string[] splittedText = Regex.Split(exampleText, @"\sand\s"); 
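To make the result of the split visible, here is a small sketch that wraps the same `Regex.Split` call in a helper; `SplitDemo` and `SplitPhrases` are illustrative names, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Splits a sentence on the word "and" surrounded by whitespace.
    public static string[] SplitPhrases(string exampleText)
    {
        return Regex.Split(exampleText, @"\sand\s");
    }
}
```

For 'A Book consists of 1 to many Authors and 1 Title and 3 Bookmarks.' this yields three elements: 'A Book consists of 1 to many Authors', '1 Title' and '3 Bookmarks.'. Only the last element keeps the trailing period, which matters for any pattern applied to the individual phrases.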

After splitting exampleText, I go through the phrases in a for loop. For each phrase I run another Regex.Match to get the values I need and put them into a new MyObject.

// Matches one phrase of the repeatable part, e.g. '1 Title' or '4 to 6 Windows.'.
// The trailing period is optional because only the last phrase keeps it after the split.
string pattern2 = @"^((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]{1,})\.?$"; 
List<MyObject> myList = new List<MyObject>(); 

//i = 1 -> since splittedText[0] contains the beginning of the sentence (e.g. 'A Car consists of 2 to 5 Seats') 
for (int i = 1; i < splittedText.Length; i++) 
{ 
    var match2 = Regex.Match(splittedText[i], pattern2); 
    if (match2.Success) 
    { 
        // Groups that did not participate in the match have an empty Value.
        myList.Add(new MyObject() 
        { 
            term = match2.Groups["TERM3"].Value, 
            min = match2.Groups["MINIMUM2"].Value, 
            max = match2.Groups["MAXIMUM2"].Value, 
            minmax = match2.Groups["MINMAX2"].Value 
        }); 
    } 
}
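A variation on the same idea, sketched here as an assumption rather than as the approach used above, is to skip the split and let `Regex.Matches` find every ' and ...' phrase directly. Each `Match` has its own `Groups`, so the values of one repetition always stay together. `PerItemMatches`, `Read` and the simplified `ItemPattern` are illustrative names. Note that this only extracts the repeated phrases; checking that the whole sentence is well-formed would still need the full pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PerItemMatches
{
    // Pattern for a single repetition, including the leading " and " (simplified, an assumption).
    private const string ItemPattern =
        @"\sand\s((?<MINIMUM2>\d+)\sto\s(?<MAXIMUM2>\d+|many)|(?<MINMAX2>\d+|many))\s(?<TERM3>[A-Z][a-z]+)";

    // Returns one row per repetition: { term, min, max, minmax }.
    // Groups that did not participate yield an empty string.
    public static List<string[]> Read(string text)
    {
        var rows = new List<string[]>();
        foreach (Match m in Regex.Matches(text, ItemPattern))
        {
            rows.Add(new[]
            {
                m.Groups["TERM3"].Value,
                m.Groups["MINIMUM2"].Value,
                m.Groups["MAXIMUM2"].Value,
                m.Groups["MINMAX2"].Value
            });
        }
        return rows;
    }
}
```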