2012-08-29 95 views

Find and replace text in a string using C#

Does anyone know how I could find and replace text within a string? Basically, I have two strings:

string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDABQODxIPDRQSERIXFhQYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f3//"; 

string secondS = "abcdefg2wBDABQODxIPDRQSERIXFh/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/abcdefg"; 

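I want to search firstS to see whether it contains any sequence of characters that also appears in secondS, and then replace that sequence. Each replaced sequence should be substituted with the number of characters it contained, in square brackets:

[NUMBER-OF-CHARACTERS-REPLACED]

For example, since firstS and secondS both contain "2wBDABQODxIPDRQSERIXFh" and "/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/f39/", those sequences need to be replaced. firstS then becomes: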

string firstS = "/9j/4AAQSkZJRgABAQEAYABgAAD/[22]QYHzMhHxwcHz8tLyUzSkFOTUlBSEZSXHZkUldvWEZIZoxob3p9hIWET2ORm4+AmnaBhH//2wBDARYXFx8bHzwhITx/VEhUf39[61]f3//"; 

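Hopefully that makes sense. I think I could do this with a regex, but I don't like how inefficient that would be. Does anyone know another, faster way?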

+0

http://en.wikipedia.org/wiki/Longest_common_substring_problem –

Answers


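Does anyone know another, faster way?

Yes, this problem actually has a proper name. It is called the Longest Common Substring problem, and it has a reasonably fast solution.

Here is an implementation on ideone. It finds and replaces all common substrings that are ten characters or longer.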

// Based on the dynamic-programming algorithm in the Wikipedia article linked above. 
// Returns the first longest common substring of s and t (empty if there is none). 
private static string FindLcs(string s, string t) { 
    // L[i,j] = length of the common suffix ending at s[i] and t[j] 
    var L = new int[s.Length, t.Length]; 
    var z = 0;     // length of the longest match found so far 
    var start = 0; // index in s where that match begins 
    for (var i = 0 ; i != s.Length ; i++) { 
        for (var j = 0 ; j != t.Length ; j++) { 
            if (s[i] != t[j]) { 
                L[i,j] = 0; 
                continue; 
            } 
            if (i == 0 || j == 0) { 
                L[i,j] = 1; 
            } else { 
                L[i,j] = L[i-1,j-1] + 1; 
            } 
            // Keep a single match only: concatenating several equal-length 
            // matches would yield a string that Replace can never find. 
            if (L[i,j] > z) { 
                z = L[i,j]; 
                start = i - z + 1; 
            } 
        } 
    } 
    return s.Substring(start, z); 
} 
// With the LCS in hand, building the answer is easy 
public static string CutLcs(string s, string t) { 
    for (;;) { 
        var lcs = FindLcs(s, t); 
        if (lcs.Length < 10) break; // ignore matches shorter than ten characters 
        s = s.Replace(lcs, string.Format("[{0}]", lcs.Length)); 
    } 
    return s; 
} 
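
The table in FindLcs holds s.Length × t.Length integers, which can get large for long inputs such as Base64 data. Since each cell depends only on the previous row, the same recurrence can probably be computed with two rolling rows. The following is a minimal sketch under that assumption, not part of the original answer; the names LcsSketch, LongestCommonSubstring and ReplaceCommon and the minLength parameter are made up for illustration. It assumes minLength is at least 4, so that every replacement makes the string shorter and the loop is guaranteed to end.

```csharp
using System;

public static class LcsSketch
{
    // Returns the first longest common substring of s and t ("" if none),
    // keeping only two rows of the dynamic-programming table in memory.
    public static string LongestCommonSubstring(string s, string t)
    {
        var prev = new int[t.Length + 1];
        var curr = new int[t.Length + 1];
        int best = 0, bestEnd = 0; // best length and its exclusive end index in s

        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                // Length of the common suffix ending at s[i-1] and t[j-1].
                curr[j] = s[i - 1] == t[j - 1] ? prev[j - 1] + 1 : 0;
                if (curr[j] > best)
                {
                    best = curr[j];
                    bestEnd = i;
                }
            }
            // Swap rows; cells 1..t.Length are overwritten on the next pass.
            var tmp = prev; prev = curr; curr = tmp;
        }
        return s.Substring(bestEnd - best, best);
    }

    // Repeatedly replaces the longest common substring (minLength or longer)
    // in s with its length in square brackets.
    public static string ReplaceCommon(string s, string t, int minLength)
    {
        // Assumption: with minLength >= 4, "[n]" is always shorter than the match.
        if (minLength < 4) throw new ArgumentOutOfRangeException("minLength");
        while (true)
        {
            string lcs = LongestCommonSubstring(s, t);
            if (lcs.Length < minLength) return s;
            s = s.Replace(lcs, "[" + lcs.Length + "]");
        }
    }
}
```

For example, LcsSketch.ReplaceCommon("xxHELLOWORLDyy", "aaHELLOWORLDbb", 10) returns "xx[10]yy". The running time is still proportional to s.Length × t.Length per pass; only the memory use changes.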

I had a similar problem, but for counting occurrences of words, so I hope this helps. I used a SortedDictionary and a binary search tree.

/* Application counts the number of occurrences of each word in a string 
    and stores them in a generic sorted dictionary. */ 
using System; 
using System.Text.RegularExpressions; 
using System.Collections.Generic; 

public class SortedDictionaryTest 
{ 
    public static void Main(string[] args) 
    { 
     // create sorted dictionary 
     SortedDictionary< string, int > dictionary = CollectWords(); 

     // display sorted dictionary content 
     DisplayDictionary(dictionary); 
    } 

    // create sorted dictionary 
    private static SortedDictionary< string, int > CollectWords() 
    { 
     // create a new sorted dictionary 
     SortedDictionary< string, int > dictionary = 
     new SortedDictionary< string, int >(); 

     Console.WriteLine("Enter a string: "); // prompt for user input 
     string input = Console.ReadLine(); 

     // split input text into tokens 
     string[] words = Regex.Split(input, @"\s+"); 

     // processing input words 
     foreach (var word in words) 
     { 
         string wordKey = word.ToLower(); // get word in lowercase 

         // if the dictionary already contains the word, increment its count 
         if (dictionary.ContainsKey(wordKey)) 
         { 
             ++dictionary[wordKey]; 
         } 
         else 
         { 
             // add new word with a count of 1 to the dictionary 
             dictionary.Add(wordKey, 1); 
         } 
     } 

     return dictionary; 
    } 

    // display dictionary content 
    private static void DisplayDictionary< K, V >(
     SortedDictionary< K, V > dictionary) 
    { 
     Console.WriteLine("\nSorted dictionary contains:\n{0,-12}{1,-12}", 
     "Key:", "Value:"); 

     /* generate output for each key in the sorted dictionary 
     by iterating through the Keys property with a foreach statement*/ 
     foreach (K key in dictionary.Keys) 
         Console.WriteLine("{0,-12}{1,-12}", key, dictionary[ key ]); 

     Console.WriteLine("\nsize: {0}", dictionary.Count); 
    } 
} 

This is probably dog slow, but if you are willing to take on some technical debt and need something for prototyping right now, you can use LINQ.

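string firstS = "123abc"; 
string secondS = "456cdeabc123"; 
int minLength = 3; 

// Requires: using System.Linq; 
// Each minLength-character window of firstS that also occurs in secondS 
// yields one copy of secondS with that window replaced. 
var result = 
    from subStrCount in Enumerable.Range(0, firstS.Length) 
    where firstS.Length - subStrCount >= minLength 
    let subStr = firstS.Substring(subStrCount, minLength) 
    where secondS.Contains(subStr) 
    select secondS.Replace(subStr, "[" + subStr.Length + "]"); 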

Result

456cdeabc[3] 
456cde[3]123
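
The query above yields one string per matching substring. If a single string with all replacements applied is wanted instead, the matches could be folded with Aggregate. This is only a rough sketch along the same lines as the answer, not the answer's own code; the names LinqSketch and ReplaceAll are hypothetical, and it keeps the answer's fixed window length, so it does not find the longest possible matches.

```csharp
using System;
using System.Linq;

public static class LinqSketch
{
    // Replaces, in 'second', every minLength-character window of 'first'
    // that also occurs in 'second', applying all replacements to one string.
    public static string ReplaceAll(string first, string second, int minLength)
    {
        // Assumption: minLength is positive (Replace rejects an empty pattern).
        if (minLength < 1) throw new ArgumentOutOfRangeException("minLength");

        return Enumerable.Range(0, Math.Max(0, first.Length - minLength + 1))
            .Select(i => first.Substring(i, minLength))
            .Where(sub => second.Contains(sub))
            .Distinct()
            // A window whose text was already consumed by an earlier
            // replacement is simply no longer found and is skipped.
            .Aggregate(second, (acc, sub) => acc.Replace(sub, "[" + sub.Length + "]"));
    }
}
```

With the strings from the answer, LinqSketch.ReplaceAll("123abc", "456cdeabc123", 3) returns "456cde[3][3]".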