2013-08-23 92 views
1

Split a string by length, breaking only at the nearest space

I have a text like this:

var data = "âô¢¬ôè÷¢ : ªîø¢è¤ô¢ - ã¿ñ¬ô ñèù¢ ªð¼ñ£÷¢ ï¤ôñ¢,«ñø¢è¤ô¢ - ªð¼ñ£÷¢ ñèù¢ ÝÁºèñ¢ ï¤ô袰ñ¢ ñ¤ì¢ì£ Üò¢òñ¢ ªð¼ñ£ñ¢ð좮 è¤ó£ñ âô¢¬ô袰ñ¢,õìè¢è¤ô¢ - ÝÁºèñ¢ ï¤ôñ¢,è¤öè¢è¤ô¢ - ô좲ñ¤ ï¤ôñ¢ ñø¢Áñ¢ 1,22 ªê ï¤ôñ¢ ð£î¢î¤òñ¢"; 

and I came across these extension methods for splitting a string:

public static IEnumerable<string> EnumByLength(this string s, int length)
{
    // Yield consecutive chunks of at most `length` characters.
    for (int i = 0; i < s.Length; i += length)
    {
        if (i + length <= s.Length)
        {
            yield return s.Substring(i, length);
        }
        else
        {
            // Last chunk: whatever is left of the string.
            yield return s.Substring(i);
        }
    }
}

public static string[] SplitByLength(this string s, int maxLen)
{
    var v = EnumByLength(s, maxLen);
    if (v == null)
        return new string[] { s };
    else
        return s.EnumByLength(maxLen).ToArray();
}

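Now the problem is:

I need to split this string with a maximum length of 150, and the split must be made only at the nearest space (just before position 150 or just after it), never in the middle of a word.

How can I do that?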

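So you want to `.Split(' ')` the string based on spaces? (It would help to clarify where a space can sit within a word.) – Sayse

Also, the split should only be performed after string index `150`.. is that right? –

This problem should be solved with a traditional/standard `while` or `for` loop, so why `LINQ`? –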

Answers

3

My version:

// Enumerate by nearest space
// Split String value by closest to length spaces
// e.g. for length = 3
// "abcd efghihjkl m n p qrstsf" -> "abcd", "efghihjkl", "m n", "p", "qrstsf"
public static IEnumerable<String> EnumByNearestSpace(this String value, int length) {
    if (String.IsNullOrEmpty(value))
        yield break;

    int bestDelta = int.MaxValue;
    int bestSplit = -1;

    int from = 0;

    for (int i = 0; i < value.Length; ++i) {
        var Ch = value[i];

        if (Ch != ' ')
            continue;

        int size = (i - from);
        int delta = (size - length > 0) ? size - length : length - size;

        if ((bestSplit < 0) || (delta < bestDelta)) {
            bestSplit = i;
            bestDelta = delta;
        }
        else {
            yield return value.Substring(from, bestSplit - from);

            i = bestSplit;

            from = i + 1;
            bestSplit = -1;
            bestDelta = int.MaxValue;
        }
    }

    // String's tail
    if (from < value.Length) {
        if (bestSplit >= 0) {
            if (bestDelta < value.Length - from)
                yield return value.Substring(from, bestSplit - from);

            from = bestSplit + 1;
        }

        if (from < value.Length)
            yield return value.Substring(from);
    }
}

... 

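var list = data.EnumByNearestSpace(150).ToList();

Thank you, sir.... :-) –

I found a problem in the `String's tail` part: `from = bestSplit + 1;` should be inside the if-statement block above it. For example, `Console.WriteLine(string.Join("#", EnumByNearestSpace("Thank you for shopping with us! We really appreciate you!", 40)));` causes 'appreciate' to go missing. –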
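The fix suggested in the comment above can be sketched as follows. This is a minimal sketch, not the answer author's code: the class name `NearestSpaceSplitter` and the method name `EnumByNearestSpaceFixed` are hypothetical, and the only intended change is that `from` advances past `bestSplit` only when the pending segment is actually yielded.

```csharp
using System;
using System.Collections.Generic;

public static class NearestSpaceSplitter
{
    // Hypothetical name; same loop as EnumByNearestSpace, with the tail handling changed.
    public static IEnumerable<string> EnumByNearestSpaceFixed(this string value, int length)
    {
        if (string.IsNullOrEmpty(value))
            yield break;

        int bestDelta = int.MaxValue;
        int bestSplit = -1;
        int from = 0;

        for (int i = 0; i < value.Length; ++i)
        {
            if (value[i] != ' ')
                continue;

            // Distance between this candidate segment length and the target length.
            int delta = Math.Abs((i - from) - length);

            if (bestSplit < 0 || delta < bestDelta)
            {
                bestSplit = i;
                bestDelta = delta;
            }
            else
            {
                // The previous space was closer to the target: cut there and rescan.
                yield return value.Substring(from, bestSplit - from);
                i = bestSplit;
                from = i + 1;
                bestSplit = -1;
                bestDelta = int.MaxValue;
            }
        }

        // String's tail
        if (from < value.Length)
        {
            if (bestSplit >= 0 && bestDelta < value.Length - from)
            {
                yield return value.Substring(from, bestSplit - from);
                // Skip past the space only when the segment before it was emitted.
                from = bestSplit + 1;
            }

            if (from < value.Length)
                yield return value.Substring(from);
        }
    }
}
```

With the sample from the comment, `"Thank you for shopping with us! We really appreciate you!".EnumByNearestSpaceFixed(40)` yields `"Thank you for shopping with us! We really"` and `"appreciate you!"`, so no word is dropped.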

0

Here you go:

for (int i = 0; i < s.Length; i += length)
{
    int index = s.IndexOf(" ", i, s.Length - i);

    if (index != -1 && index + length <= s.Length)
    {
        i = index;
        yield return s.Substring(index, length);
    }
    else
    {
        index = s.LastIndexOf(" ", 0, i);
        if (index == -1)
            yield return s.Substring(i);
        else
        {
            i = index;
            yield return s.Substring(i);
        }
    }
}

Aahh... unfortunately, this doesn't work. Words are repeated on a new line in the last line. Sorry... –

@Gokul Try it now, I fixed it – sara

Causes an ArgumentOutOfRangeException – fubo
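For the "split only after the limit" reading of the question, a minimal sketch might look like the one below. It is an illustration rather than a repair of the snippet above: the names `NextSpaceSplitter`, `SplitAtNextSpace` and `minLen` are hypothetical, and it assumes each piece should run at least `minLen` characters and then end at the next space, if there is one.

```csharp
using System;
using System.Collections.Generic;

public static class NextSpaceSplitter
{
    // Hypothetical helper: cut at the first space found at or after start + minLen.
    public static IEnumerable<string> SplitAtNextSpace(this string s, int minLen)
    {
        if (minLen < 0)
            throw new ArgumentOutOfRangeException(nameof(minLen));
        if (string.IsNullOrEmpty(s))
            yield break;

        int start = 0;
        while (start < s.Length)
        {
            int probe = start + minLen;

            // Only search when the probe position is still inside the string.
            int space = probe < s.Length ? s.IndexOf(' ', probe) : -1;

            if (space < 0)
            {
                // No further space: the remainder is the last piece.
                yield return s.Substring(start);
                yield break;
            }

            yield return s.Substring(start, space - start);
            start = space + 1; // continue after the space
        }
    }
}
```

Because every cut happens at a space, no word is ever divided; the trade-off is that a piece can be longer than `minLen` by up to one word.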

1

My version

var data = "âô¢¬ôè÷¢ : ªîø¢è¤ô¢ - ã¿ñ¬ô ñèù¢ ªð¼ñ£÷¢ ï¤ôñ¢,«ñø¢è¤ô¢ - ªð¼ñ£÷¢ ñèù¢ ÝÁºèñ¢ ï¤ô袰ñ¢ ñ¤ì¢ì£ Üò¢òñ¢ ªð¼ñ£ñ¢ð좮 è¤ó£ñ âô¢¬ô袰ñ¢,õìè¢è¤ô¢ - ÝÁºèñ¢ ï¤ôñ¢,è¤öè¢è¤ô¢ - ô좲ñ¤ ï¤ôñ¢ ñø¢Áñ¢ 1,22 ªê ï¤ôñ¢ ð£î¢î¤òñ¢"; 

var indexes = new List<int>(); 
var lastFoundIndex = 0; 
while((lastFoundIndex = data.IndexOf(' ', lastFoundIndex + 1)) != -1) 
{ 
    indexes.Add(lastFoundIndex); 
} 

int intNum = 150; 
int index; 
var newList = new List<string>(); 
while ((index = indexes.Where(x => x > intNum - 150 && x <= intNum).LastOrDefault()) != 0) 
{ 
    var firstIndex = newList.Count == 0 ? 0 : index; 
    var lastIndex = firstIndex + 150 >= data.Length ? data.Length - 150 : intNum; 
    newList.Add(data.Substring(intNum - 150, lastIndex)); 
    intNum += 150; 
} 

newList contains the split strings.

Tested with "Thank you for shopping with us! We really appreciate you!" split at 40 characters. It really does split it. –
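For comparison, the "break at the last space before the limit" variant can be sketched with `String.LastIndexOf(char, int, int)`, which searches backwards over a bounded window. This is a sketch under stated assumptions, not the answer author's method: the names `PreviousSpaceSplitter` and `SplitAtPreviousSpace` are hypothetical, and a single word longer than `maxLen` is assumed to be kept whole by falling back to the next space.

```csharp
using System;
using System.Collections.Generic;

public static class PreviousSpaceSplitter
{
    // Hypothetical helper: pieces of at most maxLen characters, cut at spaces only.
    public static IEnumerable<string> SplitAtPreviousSpace(this string s, int maxLen)
    {
        if (maxLen <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxLen));
        if (string.IsNullOrEmpty(s))
            yield break;

        int start = 0;
        while (s.Length - start > maxLen)
        {
            // Search backwards over positions start + maxLen down to start + 1.
            int space = s.LastIndexOf(' ', start + maxLen, maxLen);

            // Assumption: a word longer than maxLen is kept whole, so look forward instead.
            if (space < 0)
                space = s.IndexOf(' ', start + maxLen);

            if (space < 0)
                break; // no space left at all

            yield return s.Substring(start, space - start);
            start = space + 1;
        }

        if (start < s.Length)
            yield return s.Substring(start);
    }
}
```

For the 40-character test string from the comment above, this yields `"Thank you for shopping with us! We"` and `"really appreciate you!"`, both within the limit and neither cutting a word.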