2011-10-26
2

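Regex to capture the sentence containing a target phrase

I am using C# to find a phrase that may or may not be present in a blog post. I need to capture the entire sentence that contains the target phrase.

I thought about using the string.Contains method, but that would leave me with the whole blog post, when all I want is the target phrase and the sentence that contains it.

Example: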

I dont want this sentence. I also don't want this setence. But I do want this sentence. 

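So here the target phrase is "I do", and the regex should return the whole sentence that contains it: "But I do want this sentence."

Thanks. Aaron

Answers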

2

This regex:

resultString = Regex.Match(subjectString, @"(?<=^|\.)[^.]*?(?=\bI do\b).*(\.|$)").Value; 

when applied to your input:

I dont want this sentence. I also don't want this setence. But I do want this sentence. 

returns:

But I do want this sentence. 

Turn on RegexOptions.Singleline if the text may span multiple lines, so that . also matches newline characters.
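The pattern above hard-codes the phrase, and its greedy `.*` continues past the sentence's own period to the rest of the line, which only goes unnoticed when the target sentence is the last one. As a rough sketch of one way to adapt it, the version below takes the phrase as a parameter and replaces `.*` with `[^.]*` so the match stops at the first period. The names `SentenceFinder` and `FindSentence` are invented for illustration, and the sketch assumes that sentences end with a period and that the phrase starts and ends with a word character (so that `\b` behaves as expected).

```csharp
using System;
using System.Text.RegularExpressions;

static class SentenceFinder
{
    // Hypothetical helper (not part of the answer above): returns the first
    // sentence containing `phrase` as whole words, or "" if there is none.
    public static string FindSentence(string text, string phrase)
    {
        // Regex.Escape keeps any regex metacharacters in the phrase literal.
        // [^.]* on both sides confines the match to a single sentence.
        string pattern = @"(?<=^|\.)[^.]*?\b" + Regex.Escape(phrase) + @"\b[^.]*(\.|$)";

        // Trim removes the space that follows the previous sentence's period.
        return Regex.Match(text, pattern).Value.Trim();
    }

    static void Main()
    {
        string post = "I dont want this sentence. I do want this one. But not this one.";
        Console.WriteLine(FindSentence(post, "I do")); // prints: I do want this one.
    }
}
```

Because `[^.]` also matches newline characters, this variant does not need RegexOptions.Singleline.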

+0

thx. That works great – Aaron

1

I don't know regular expressions, but you could combine the Split and Contains functions and write something like this:

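// Returns the first '.'-delimited piece of the blog post that contains the target phrase,
// or an empty string if no piece contains it.
string DoesBlogContainSentence(string blog, string target)
{
    string[] blogSentences = blog.Split(new char[] { '.' });

    foreach (string sentence in blogSentences)
    {
        if (sentence.Contains(target))
        {
            return sentence;
        }
    }

    return string.Empty;
}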
+0

Splitting on '.' alone will not necessarily give you only sentences. For example, a decimal number would be split as well. – wdavo
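To make the decimal-number concern concrete, here is a small sketch that compares a plain split on '.' with a split that only breaks where sentence-ending punctuation is followed by whitespace. `SentenceSplitDemo` and `SplitSentences` are hypothetical names, and the rule is only a heuristic: abbreviations such as "e.g. " would still be treated as sentence ends.

```csharp
using System;
using System.Text.RegularExpressions;

static class SentenceSplitDemo
{
    // Hypothetical helper: breaks the text at whitespace that directly follows
    // '.', '!' or '?', so the period inside "3.5" is not a split point.
    // The lookbehind keeps the punctuation attached to its sentence.
    public static string[] SplitSentences(string text)
    {
        return Regex.Split(text.Trim(), @"(?<=[.!?])\s+");
    }

    static void Main()
    {
        string post = "The fee is 3.5 percent. I do want this sentence.";

        // Plain split: "The fee is 3", "5 percent", " I do want this sentence", ""
        Console.WriteLine(post.Split('.').Length);      // prints: 4

        // Whitespace-aware split keeps "3.5" intact.
        Console.WriteLine(SplitSentences(post).Length); // prints: 2
    }
}
```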

1

You could split the blog post into sentences and then search each sentence for the target phrase.

E.g.

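string data = "I dont want this sentence. I also don't want this setence. But I do want this sentence.";
string targetPhrase = "I do";

// Split on each period that is followed by a whitespace character.
string[] sentences = Regex.Split(data, "\\.\\s");

foreach (string sentence in sentences)
{
    // \b word boundaries let the phrase match as whole words,
    // including at the very start or end of a sentence.
    if (Regex.IsMatch(sentence, "\\b" + Regex.Escape(targetPhrase) + "\\b"))
    {
        //.....
    }
}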