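I am trying to compute the difference between two strings and get the result as a list of strings.
For example:
string val1 = "Have a good day";
string val2 = "Have a very good day, Joe";
The result would be a list of strings containing the differences, with 2 items: "very" and "Joe".
So far my research into this task hasn't turned up much.
Edit: The result will most likely need to be 2 separate lists of strings, one holding the additions and one holding the removals.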
Actually, I followed these steps:
(i) Obtain all words from both sentences, ignoring special characters.
(ii) Find the difference between the two lists.
Code:
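using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string s1 = "Have a good day";
        string s2 = "Have a very good day, Joe";
        IEnumerable<string> diff;
        // Collect the words of each sentence. The pattern can also produce
        // empty matches, so those are filtered out.
        MatchCollection matches = Regex.Matches(s1, @"\b[\w']*\b");
        IEnumerable<string> first = from m in matches.Cast<Match>()
                                    where !string.IsNullOrEmpty(m.Value)
                                    select TrimSuffix(m.Value);
        MatchCollection matches1 = Regex.Matches(s2, @"\b[\w']*\b");
        IEnumerable<string> second = from m in matches1.Cast<Match>()
                                     where !string.IsNullOrEmpty(m.Value)
                                     select TrimSuffix(m.Value);
        // Subtract the shorter word list from the longer one.
        // Note: this reports the difference in one direction only.
        if (second.Count() > first.Count())
        {
            diff = second.Except(first).ToList();
        }
        else
        {
            diff = first.Except(second).ToList();
        }
    }

    // Drops everything from the first apostrophe onward, e.g. "Joe's" -> "Joe".
    static string TrimSuffix(string word)
    {
        int apostropheLocation = word.IndexOf('\'');
        if (apostropheLocation != -1)
        {
            word = word.Substring(0, apostropheLocation);
        }
        return word;
    }
}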
OUTPUT: very, Joe
Your code does not produce the OP's expected result. – 2014-12-02 16:58:56
@ErikPhilips I have revised the answer – Sajeetharan 2014-12-02 17:22:50
You have to remove the ',' in order to get the expected result:
string s1 = "Have a good day";
string s2 = "Have a very good day, Joe";
int index = s2.IndexOf(','); // index of the character to be removed, or -1 if absent
IEnumerable<string> diff;
IEnumerable<string> first = s1.Split(' ').Distinct();
// Remove the comma (only if one was found) before splitting into words.
IEnumerable<string> second = (index >= 0 ? s2.Remove(index, 1) : s2).Split(' ').Distinct();
if (second.Count() > first.Count())
{
    diff = second.Except(first).ToList();
}
else
{
    diff = first.Except(second).ToList();
}
This is the simplest version I could think of:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string val1 = "Have a good day";
        string val2 = "Have a very good day, Joe";
        MatchCollection words1 = Regex.Matches(val1, @"\b(\w+)\b");
        MatchCollection words2 = Regex.Matches(val2, @"\b(\w+)\b");
        var hs1 = new HashSet<string>(words1.Cast<Match>().Select(m => m.Value));
        var hs2 = new HashSet<string>(words2.Cast<Match>().Select(m => m.Value));
        // Optionally you can use a custom comparer for the words.
        // var hs2 = new HashSet<string>(words2.Cast<Match>().Select(m => m.Value), new MyComparer());
        // After this operation hs2 contains only 'very' and 'Joe'.
        hs2.ExceptWith(hs1);
    }
}

public class MyComparer : IEqualityComparer<string>
{
    public bool Equals(string one, string two)
    {
        return string.Equals(one, two, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(string item)
    {
        // The hash must ignore case as well, otherwise words that compare
        // equal could end up in different hash buckets.
        return StringComparer.OrdinalIgnoreCase.GetHashCode(item);
    }
}
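As a side note on the optional comparer mentioned in the comments: the framework already ships StringComparer.OrdinalIgnoreCase, which can be passed to HashSet<string> directly. The following is only a minimal sketch under that assumption; the class name IgnoreCaseDiff and the method NewWords are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class IgnoreCaseDiff
{
    // Returns the words of 'after' that do not occur in 'before',
    // comparing words without regard to case.
    public static HashSet<string> NewWords(string before, string after)
    {
        var comparer = StringComparer.OrdinalIgnoreCase;
        var beforeWords = new HashSet<string>(
            Regex.Matches(before, @"\w+").Cast<Match>().Select(m => m.Value), comparer);
        var afterWords = new HashSet<string>(
            Regex.Matches(after, @"\w+").Cast<Match>().Select(m => m.Value), comparer);
        // ExceptWith uses the comparer of the set it is called on.
        afterWords.ExceptWith(beforeWords);
        return afterWords;
    }
}
```

Calling IgnoreCaseDiff.NewWords("have a GOOD day", "Have a very good day, Joe") should leave just "very" and "Joe" in the returned set, since "have" and "good" now match regardless of case.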
This code:
enum Where { None, First, Second, Both } // somewhere in your source file
//...
var val1 = "Have a good calm day calm calm calm";
var val2 = "Have a very good day, Joe Joe Joe Joe";
var words1 = from m in Regex.Matches(val1, "(\\w+)|(\\S+\\s+\\S+)").Cast<Match>()
where m.Success
select m.Value.ToLower();
var words2 = from m in Regex.Matches(val2, "(\\w+)|(\\S+\\s+\\S+)").Cast<Match>()
where m.Success
select m.Value.ToLower();
var dic = new Dictionary<string, Where>();
foreach (var s in words1)
{
dic[s] = Where.First;
}
foreach (var s in words2)
{
Where b;
if (!dic.TryGetValue(s, out b)) b = Where.None;
switch (b)
{
case Where.None:
dic[s] = Where.Second;
break;
case Where.First:
dic[s] = Where.Both;
break;
}
}
foreach (var kv in dic.Where(x => x.Value != Where.Both))
{
Console.WriteLine(kv.Key);
}
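gives us 'calm', 'very', ', joe' and 'joe', which are the differences between the two strings: 'calm' from the first one, and 'very', ', joe' and 'joe' from the second (the second regex alternative matches ", Joe" as a single token). It also removes the repeated occurrences.
And to get two separate lists that show which text each word came from: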
var list1 = dic.Where(x => x.Value == Where.First).ToList();
var list2 = dic.Where(x => x.Value == Where.Second).ToList();
foreach (var kv in list1)
{
Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
}
foreach (var kv in list2)
{
Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
}
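Split the characters into two sets, then compute the relative complement of those sets.
The relative complement will be available in any good set library.
You may need to take care to preserve the order of the characters.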
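The idea above can be sketched roughly as follows, working on words rather than single characters because that is what the question asks for. This is only one possible reading: the class WordDiff and its methods Words and Complement are hypothetical names introduced here, and the choice to drop duplicates while keeping first-occurrence order is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for illustration only.
public static class WordDiff
{
    // Splits text into words, ignoring punctuation and whitespace.
    public static List<string> Words(string text)
    {
        return Regex.Matches(text, @"\w+").Cast<Match>().Select(m => m.Value).ToList();
    }

    // Relative complement: the items of 'source' that do not occur in 'other',
    // in their original order and with duplicates removed.
    public static List<string> Complement(IEnumerable<string> source, IEnumerable<string> other)
    {
        var exclude = new HashSet<string>(other);
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (var word in source)
        {
            // HashSet.Add returns false when the word was already recorded.
            if (!exclude.Contains(word) && seen.Add(word))
            {
                result.Add(word);
            }
        }
        return result;
    }
}
```

With w1 = WordDiff.Words("Have a good day") and w2 = WordDiff.Words("Have a very good day, Joe"), Complement(w2, w1) would be the list of additions ("very", "Joe") and Complement(w1, w2) the list of removals (empty here), which matches the two separate lists requested in the question's edit.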
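What do you mean by "your research"? Did you write some code? Then share it. You haven't written anything? In that case I think it's unfair to expect us to write the code for you. Make an effort! – mason 2014-12-02 16:53:34
I have written code and done the research. My code doesn't even come close to working as expected – mrb398 2014-12-02 16:54:31
Check out https://github.com/mmanela/diffplex – haim770 2014-12-02 16:54:50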