2014-05-12 21 views
0

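Split by delimiter without removing it from the string

I want to use a regular expression to split a long string into separate lines. A line can contain any possible Unicode character. A line "ends" at a dot ("." - one or more) or at a newline ("\n").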

Example:

  • This string is the input:

    "line1. line2.. line3... line4.... line5..... line6 
    \n 
    line7" 

  • The output should be:

    "line1."
    "line2.."
    "line3..."
    "line4...."
    "line5....."
    "line6"
    "line7"
+0

From your example, it looks like you just want to split on whitespace? For example: inputString.Split(new char[] { ' ', '\r', '\n', '\t' }, StringSplitOptions.RemoveEmptyEntries)

Answers

1

If I understand what you are asking, you could try a pattern like this:

(?<=\.)(?!\.)|\n 

This splits the string at any position that is preceded by a . but not followed by a ., or at a \n character.

Note that this pattern keeps any whitespace that follows the dots, for example:

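// Regular (non-verbatim) literal: in a verbatim @"..." string, \n would be a backslash followed by 'n', not a newline.
var input = "line1. line2.. line3... line4.... line5..... line6\nline7"; 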
var output = Regex.Split(input, @"(?<=\.)(?!\.)|\n"); 

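This produces the following (line2 through line6 each keep the leading space that followed the previous dot):

line1. 
 line2.. 
 line3... 
 line4.... 
 line5..... 
 line6 
line7 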

If you want to get rid of the whitespace, simply change it to:

(?<=\.)(?!\.)\s*|\n 

But if you know the dots will always be followed by whitespace, you can simplify it to:

(?<=\.)\s+|\n 
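
To compare the three patterns side by side, here is a minimal sketch of a console program. The class name SplitKeepDelimiterDemo and the helper SplitLines are made up for illustration; only Regex.Split comes from the answer. Each piece is printed in brackets so that any leading whitespace is visible.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitKeepDelimiterDemo
{
    // Hypothetical helper: a thin wrapper around Regex.Split.
    public static string[] SplitLines(string input, string pattern)
    {
        return Regex.Split(input, pattern);
    }

    public static void Main()
    {
        // Regular (non-verbatim) literal, so \n is a real newline character.
        string input = "line1. line2.. line3... line4.... line5..... line6\nline7";

        string[] patterns =
        {
            @"(?<=\.)(?!\.)|\n",     // keeps the whitespace after the dots
            @"(?<=\.)(?!\.)\s*|\n",  // consumes optional whitespace after the dots
            @"(?<=\.)\s+|\n"         // assumes dots are always followed by whitespace
        };

        foreach (string pattern in patterns)
        {
            Console.WriteLine("Pattern: " + pattern);
            foreach (string piece in SplitLines(input, pattern))
            {
                // Brackets make leading or trailing whitespace visible.
                Console.WriteLine("[" + piece + "]");
            }
            Console.WriteLine();
        }
    }
}
```

With the first pattern the pieces for line2 through line6 should start with a space; with the other two they should not. If the input can end with a dot, the zero-width patterns will likely yield a trailing empty string, which can be filtered out afterwards.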
1

Try this:

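// The replacement is a regular (non-verbatim) literal: .NET replacement patterns do not process
// escapes such as \n, so a verbatim @"...\n" would insert a backslash and an 'n'.
String result = Regex.Replace(subject, @"""?(\w+([.]+)?)(?:[\n ]|[""\n]$)+", "\"$1\"\n"); 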

/* 
"line1." 
"line2.." 
"line3..." 
"line4...." 
"line5....." 
"line6" 
"line7" 
*/ 

Regex explanation

"?(\w+([.]+)?)(?:[\n ]|["\n]$)+ 

Match the character “"” literally «"?» 
    Between zero and one times, as many times as possible, giving back as needed (greedy) «?» 
Match the regular expression below and capture its match into backreference number 1 «(\w+([.]+)?)» 
    Match a single character that is a “word character” (letters, digits, and underscores) «\w+» 
     Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» 
    Match the regular expression below and capture its match into backreference number 2 «([.]+)?» 
     Between zero and one times, as many times as possible, giving back as needed (greedy) «?» 
     Match the character “.” «[.]+» 
     Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» 
Match the regular expression below «(?:[\n ]|["\n]$)+» 
    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» 
    Match either the regular expression below (attempting the next alternative only if this one fails) «[\n ]» 
     Match a single character present in the list below «[\n ]» 
     A line feed character «\n» 
     The character “ ” « » 
    Or match regular expression number 2 below (the entire group fails if this one fails to match) «["\n]$» 
     Match a single character present in the list below «["\n]» 
     The character “"” «"» 
     A line feed character «\n» 
     Assert position at the end of the string (or before the line break at the end of the string, if any) «$» 
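
For anyone who wants to run this, here is a minimal sketch. It assumes that subject contains the surrounding double quotes from the question, and that the replacement string carries a real newline character. The names QuoteLinesDemo and QuoteLines are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteLinesDemo
{
    // Hypothetical helper: wraps each "word plus trailing dots" in quotes,
    // one per line, using the pattern from this answer.
    public static string QuoteLines(string subject)
    {
        return Regex.Replace(
            subject,
            @"""?(\w+([.]+)?)(?:[\n ]|[""\n]$)+",
            "\"$1\"\n"); // regular literal: \n here is a real newline
    }

    public static void Main()
    {
        // Assumption: the subject includes the opening and closing quotes.
        string subject = "\"line1. line2.. line3... line4.... line5..... line6\nline7\"";
        Console.Write(QuoteLines(subject));
    }
}
```

Note that the pattern only matches a line when a space, a newline, or a closing quote at the end of the string follows it. Without the closing quote, the last line would be left unquoted. Because \w+ matches word characters only, lines that contain spaces or punctuation other than trailing dots are not handled by this sketch.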
0

If you want to keep all the dots intact, and each run of dots is followed by a whitespace character, then this could be your regex:

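// The dot is escaped so it matches a literal '.', and the replacement is a regular literal so \n is a real newline.
String result = Regex.Replace(t, @"\.\s", ".\n"); 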

The result is a single string. You have not said whether you want the result as one string or as several.
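
If separate strings are wanted after all, the replaced text can be split on the newline character afterwards. Here is a minimal sketch, assuming the dot in the pattern is meant literally (escaped as \.) and the replacement contains a real newline. The names ReplaceThenSplitDemo and BreakAfterDots are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceThenSplitDemo
{
    // Hypothetical helper: turns "dot followed by one whitespace character"
    // into "dot followed by a newline", keeping every dot.
    public static string BreakAfterDots(string t)
    {
        return Regex.Replace(t, @"\.\s", ".\n");
    }

    public static void Main()
    {
        string t = "line1. line2.. line3... line4.... line5..... line6\nline7";

        string joined = BreakAfterDots(t);   // one string, one line per entry
        string[] lines = joined.Split('\n'); // optional: separate strings

        foreach (string line in lines)
        {
            Console.WriteLine("[" + line + "]");
        }
    }
}
```

Only the whitespace character directly after the last dot of a run is replaced, so any additional whitespace would stay at the start of the next line.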