Regex to find a specific word that contains a special character

string emailBody = " holla holla testing is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds ";

// Collapse line breaks into single spaces.
emailBody = string.Join(" ", Regex.Split(emailBody.Trim(), @"(?:\r\n|\n|\r)"));

// Collect the distinct field names that start with "New" and end at ":".
var keys = Regex.Matches(emailBody, @"\bNew\B(.+?):", RegexOptions.Singleline)
    .OfType<Match>()
    .Select(m => m.Groups[0].Value.Replace(":", ""))
    .Distinct()
    .ToArray();

foreach (string key in keys)
{
    List<string> valueList = new List<string>();
    string regex = "" + key + ":" + "\"(?<" + GetCleanKey(key) + ">[^\"]*)\"";

    var matches = Regex.Matches(emailBody, regex, RegexOptions.Singleline);
    foreach (Match match in matches)
    {
        if (match.Success)
        {
            string value = match.Groups[GetCleanKey(key)].Value;
            if (!valueList.Contains(value.Trim()))
            {
                valueList.Add(value.Trim());
            }
        }
    }
}

// Strips characters that are not allowed in a regex group name.
public string GetCleanKey(string key)
{
    return key.Replace(" ", "").Replace("-", "").Replace("#", "").Replace("$", "").Replace("*", "").Replace("!", "").Replace("@", "")
        .Replace("%", "").Replace("^", "").Replace("&", "").Replace("(", "").Replace(")", "").Replace("[", "").Replace("]", "").Replace("?", "")
        .Replace("<", "").Replace(">", "").Replace("'", "").Replace(";", "").Replace("/", "").Replace("\"", "").Replace("+", "").Replace("~", "").Replace("`", "")
        .Replace("{", "").Replace("}", "").Replace("|", "");
}

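In the code above I am trying to get the quoted value that follows NewEBTDI$:.

When I include the $ sign in the pattern, it does not find the value next to the field name.

If the $ is removed and only NewEBTDI is specified, it does find the value.

I want to search for the value with the $ sign included.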

Source

2016-01-21 Savan Patel

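Please format your code properly. It is unreadable. – 2016-01-21 20:17:59

`$` has a special meaning in regular expressions. Escape it with `\`. In your case, though, you will have to do that with a String.Replace() call, because your regex is generated. You may have other special characters as well...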

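Characters that have a special meaning in regular expressions but must be matched literally have to be escaped. You can do that with Regex.Escape. In your case the character is the $ sign, which, if left unescaped, anchors the match to the end of the input (or to the end of a line when RegexOptions.Multiline is set).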

string regex = "" + Regex.Escape(key) + ":" + "\"(?<" + Regex.Escape(GetCleanKey(key))
    + ">[^\"]*)\"";

or

string regex = String.Format("{0}:\"(?<{1}>[^\"]*)\"",
    Regex.Escape(key),
    Regex.Escape(GetCleanKey(key)));

Or, with VS 2015 (C# 6), using string interpolation:

string regex = $"{Regex.Escape(key)}:\"(?<{Regex.Escape(GetCleanKey(key))}>[^\"]*)\"";

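(It looks better in the editor than it does here, because the C# editor colors the string parts and the embedded C# expressions differently.)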
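To see the effect in isolation, here is a minimal sketch that builds the same kind of generated pattern with and without Regex.Escape. The class name EscapeDemo and the method FindValues are hypothetical and not part of the question's code, and the sketch uses one fixed group name (value) instead of the question's per-key group names, which is a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: returns the distinct quoted values that follow "key:".
    // When escapeKey is false, the key is spliced into the pattern verbatim,
    // which is what the question's code does.
    public static List<string> FindValues(string input, string key, bool escapeKey)
    {
        string keyPattern = escapeKey ? Regex.Escape(key) : key;

        // Fixed group name "value" (an assumption made for this sketch).
        string pattern = keyPattern + ":\"(?<value>[^\"]*)\"";

        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["value"].Value.Trim())
                    .Distinct()
                    .ToList();
    }
}
```

Called on a string such as NewFinancial History:"xyz" NewEBTDI$:"abc" with the key NewEBTDI$, the unescaped variant finds nothing, because $ demands the end of the input before the colon. The escaped variant returns abc. A fixed group name also avoids having to sanitize the key into a valid group name at all.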

Source

2016-01-21 20:26:44

I did not know about Regex.Escape!

Thanks, it worked for me!

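It is unclear what the ultimate goal is, but the $ in a pattern is a metacharacter (an anchor) that means either the end of a line or the end of the buffer, depending on whether RegexOptions.Multiline is set.

Why not just capture the text before the : into a named capture, and then extract the quoted value, like this: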

var data = "...is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds";

var pattern = @"
(?<New>New[^:]+)      # Capture `New` plus everything after it that is *not* (`^`) a `:`, one or more.
:                     # actual `:`
\x22                  # opening quote character
(?<InQuotes>[^\x22]+) # text that is not a quote, one or more
\x22                  # closing quote character
";

// IgnorePatternWhitespace allows us to comment the pattern. It does not affect processing.
var results = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture)
    .OfType<Match>()
    .Select(mt => new
    {
        NewText = mt.Groups["New"].Value,
        Text = mt.Groups["InQuotes"].Value
    })
    .ToList();


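Note that I use the hex escape \x22 instead of escaping the quote character in the pattern, which makes the pattern easier to work with: it keeps the C# compiler from prematurely processing pattern escapes that need to stay intact.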
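As a rough, self-contained sketch of the named-capture approach, the pattern can be wrapped in a helper that returns key/value pairs. The class name NamedCaptureDemo and the method Extract are hypothetical, and the pattern is written on one line here rather than in the commented multi-line form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NamedCaptureDemo
{
    // Hypothetical helper: returns a (field name, quoted value) pair
    // for every New...:"..." item found in the input.
    public static List<KeyValuePair<string, string>> Extract(string data)
    {
        // \x22 is the double-quote character. [^:]+ lets the field name
        // contain "$", spaces, and other characters without any escaping.
        const string pattern = @"(?<New>New[^:]+):\x22(?<InQuotes>[^\x22]+)\x22";

        return Regex.Matches(data, pattern, RegexOptions.ExplicitCapture)
                    .Cast<Match>()
                    .Select(m => new KeyValuePair<string, string>(
                        m.Groups["New"].Value,
                        m.Groups["InQuotes"].Value))
                    .ToList();
    }
}
```

For the sample string used in this answer, this should yield three pairs: NewFinancial History with xyz, NewFinancial History with abc, and NewEBTDI$ with abc. Because the field name is captured from the input rather than spliced into a pattern, the $ never reaches the regex engine as a metacharacter.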

Source

2016-01-22 02:57:21 OmegaMan

Thanks, it worked for me!
