2017-01-31 65 views
-3

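Cleanest, fastest, most efficient way to parse a string in C#

I have to create a string parser in C#. The string needs to be parsed into a parent-child relationship. The string looks like this: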

Water, Bulgur Wheat (29%), Sweetened Dried Cranberries (5%) (Sugar, Cranberries), Sunflower Seeds (3%), Onion (3%), Green Lentils (2%), Palm Oil, Flavourings (contain Barley), Lemon Juice Powder (<2%) (Maltodextrin, Lemon Juice Concentrate), Ground Spices (<2%) (Paprika, Black Pepper, Cinnamon, Coriander, Cumin, Chilli Powder, Cardamom, Pimento, Ginger), Dried Herbs (<2%) (Coriander, Parsley, Mint), Dried Garlic (<2%), Salt, Maltodextrin, Onion Powder (<2%), Cumin Seeds, Dried Lemon Peel (<2%), Acid (Citric Acid) 

I know I can go char by char and eventually find my way through it, but what is the easiest way to get this information?

Expected output: [image]

+0

@Anand: Thanks for your reply. I have replaced all the brackets with parentheses and broken it down into a tree structure – Supreet

+0

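The last part of my comment is still unanswered. What is the expected output? Suppose you pass this string to a function: what do you expect back? – A3006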

+0

@Anand: Please find the expected output in the question – Supreet

Answers

0
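// Requires: using System.Text;
public static string ParseString(string input)
{
    StringBuilder sb = new StringBuilder();
    bool skipNext = false; // set after a comma so the space that follows it is skipped
    foreach (char c in input)
    {
        if (skipNext)
        {
            skipNext = false;
            if (c == ' ')
            {
                continue; // drop only the space after a comma, never a real character
            }
        }

        switch (c)
        {
            case '(': // start of a parenthesised group: new line, indented
                sb.Append("\n\t");
                break;
            case ',': // item separator: new line
                sb.Append("\n");
                skipNext = true;
                break;
            case ')': // end of a group: new line
                sb.Append("\n");
                break;
            default:
                sb.Append(c);
                break;
        }
    }

    return sb.ToString();
}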

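This should get you started. It does not handle parentheses that do not denote children, such as (29%) or (contain Barley).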
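One possible way to cope with parentheses that do not denote children is to track the nesting depth and split only on commas at depth zero, leaving the parenthesised parts of each item for a later step. The following is a minimal sketch, assuming the parentheses are balanced; the names `TopLevelSplitter` and `SplitTopLevel` are made up for this illustration and are not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Text;

public static class TopLevelSplitter
{
    // Splits the input on commas that lie outside any parentheses.
    // Hypothetical helper: assumes balanced parentheses.
    public static List<string> SplitTopLevel(string input)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        int depth = 0; // number of '(' currently open

        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
            }

            if (c == ',' && depth == 0)
            {
                // A top-level comma ends the current item.
                items.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last item has no trailing comma, so add it here.
        if (current.Length > 0)
        {
            items.Add(current.ToString().Trim());
        }
        return items;
    }
}
```

For example, `TopLevelSplitter.SplitTopLevel("Water, Sweetened Dried Cranberries (5%) (Sugar, Cranberries), Salt")` yields three items, with the cranberry entry kept in one piece together with both of its parenthesised groups.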

0

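After looking at the posted data (Water, Bulgur Wheat...), one problem will be distinguishing and separating each individual item: 1 Water, 2 Bulgur Wheat, 3 Sweetened Dried Cranberries, and so on.

Splitting on commas "," will not work, because some of the parentheses "()" contain commas, as in (Sugar, Cranberries). Those items (Sugar, Cranberries) are SUB-items of Sweetened Dried Cranberries, so a plain split on commas would break them apart from their parent.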
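To make the problem concrete, here is a tiny sketch of the naive split. The class and method names (`NaiveSplitDemo`, `SplitOnCommas`) are hypothetical and exist only for this illustration.

```csharp
public static class NaiveSplitDemo
{
    // Splits on every comma, including the ones inside parentheses.
    public static string[] SplitOnCommas(string input)
    {
        return input.Split(',');
    }
}
```

Calling `NaiveSplitDemo.SplitOnCommas("Sweetened Dried Cranberries (5%) (Sugar, Cranberries), Salt")` returns three pieces rather than two: the first ends with the unclosed fragment "(Sugar", and the second is " Cranberries)", so the sub-items are cut off from their parent.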

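Given the data you provided, I would consider changing its format to accommodate this. A simple change would be to replace the comma delimiter between the sub-items with something else; a dash "-" may work.

The regex code below does just that. It changes every comma "," that sits between an opening and a closing parenthesis "()" to a dash "-". A split on commas can then identify each top-level item.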

private static string ReplaceCommaBetweenParens(string inString) { 
    string pattern = @"(?<=\([^\)]*)+,(?!\()(?=[^\(]*\))"; 
    return Regex.Replace(inString, pattern, "-"); 
} 

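The code above is not pretty. I got it from somewhere else and wish I could cite the original author. I welcome comments on this pattern from all Regex enthusiasts. I do not know how to do this with regular string methods (Split/IndexOf); I believe it would take several steps. It is a good example of how useful Regex can be in some cases: it may be ugly, but it works very fast. Fortunately, cryptic code like the above (Regex) is not needed much after this step.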
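For comparison, the same replacement can probably be done without Regex by walking the string once and counting open parentheses. This is only a sketch under the assumption that parentheses are balanced; `ParenCommaReplacer` and `ReplaceCommasInsideParens` are hypothetical names, and the result may differ from the regex version on nested or unbalanced input.

```csharp
using System.Text;

public static class ParenCommaReplacer
{
    // Replaces every comma found inside parentheses with the given character.
    // Hypothetical helper: assumes balanced parentheses.
    public static string ReplaceCommasInsideParens(string input, char replacement)
    {
        var sb = new StringBuilder(input.Length);
        int depth = 0; // number of '(' currently open

        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;
            }
            else if (c == ')' && depth > 0)
            {
                depth--;
            }

            // Only commas inside at least one open parenthesis are swapped.
            sb.Append(c == ',' && depth > 0 ? replacement : c);
        }
        return sb.ToString();
    }
}
```

With this sketch, `ParenCommaReplacer.ReplaceCommasInsideParens("A (x, y), B", '-')` gives "A (x- y), B": the comma inside the parentheses becomes a dash and the top-level comma is left alone. The replacement character is a parameter so that one which never occurs in the data can be chosen.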

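Once this change is made, indenting the output as required is a fairly straightforward process. The code below reads each row of a DataTable. Each row may have one or more items separated by commas ",". The code loops through each row and parses the items in the string. I made a simple class to hold the items; however, if a class is not needed, the commented-out lines in the code print the proper output directly. Hope this helps.

A simple class to hold a single item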

using System;
using System.Collections.Generic;
using System.Text;

class Ingredient {

    int ID { get; set; }
    string Name { get; set; }
    string Percent { get; set; }            // e.g. "(29%)", or "" when no percentage is given
    List<string> Ingredients { get; set; }  // sub-ingredients; empty when there are none

    public Ingredient(int id, string name, string pct, List<string> ingredients) {
        ID = id;
        Name = name;
        Percent = pct;
        Ingredients = ingredients;
    }

    // Prints the item on one line, followed by one indented line per sub-ingredient.
    public override string ToString() {
        StringBuilder sb = new StringBuilder();
        sb.Append(ID + "\t" + Name + " " + Percent + Environment.NewLine);
        foreach (string s in Ingredients) {
            sb.Append("\t\t" + s + Environment.NewLine);
        }
        return sb.ToString();
    }
}

Code that uses the class above

// The members below belong inside a Program class and need:
// using System; using System.Collections.Generic; using System.Text.RegularExpressions;
static string ingredients = "Water, Bulgur Wheat(29%), Sweetened Dried Cranberries(5%) (Sugar, Cranberries)," +
           " Sunflower Seeds(3%), Onion(3%), Green Lentils(2%), Palm Oil, Flavourings (contain Barley)," +
           " Lemon Juice Powder(<2%) (Maltodextrin, Lemon Juice Concentrate)," +
           " Ground Spices(<2%) (Paprika, Black Pepper, Cinnamon, Coriander, Cumin, Chilli Powder, Cardamom, Pimento, Ginger)," +
           " Dried Herbs(<2%) (Coriander, Parsley, Mint), Dried Garlic(<2%), Salt, Maltodextrin, Onion Powder(<2%)," +
           " Cumin Seeds, Dried Lemon Peel(<2%), Acid(Citric Acid)";

static List<Ingredient> allIngredients;

static void Main(string[] args) {
    allIngredients = ParseString(ingredients);
    foreach (Ingredient curIngredient in allIngredients) {
        Console.Write(curIngredient.ToString());
    }
    Console.ReadLine();
}

private static List<Ingredient> ParseString(string inString) {
    List<Ingredient> allIngredients = new List<Ingredient>();
    // Use the parameter, not the static field, so the method works for any input.
    string temp = ReplaceCommaBetweenParens(inString);
    string[] allItems = temp.Split(',');
    int count = 1;
    foreach (string curItem in allItems) {
        if (curItem.Contains("(")) {
            allIngredients.Add(ParseItem(curItem, count));
        }
        else {
            allIngredients.Add(new Ingredient(count, curItem.Trim(), "", new List<string>()));
            //Console.WriteLine(count + "\t" + curItem.Trim());
        }
        count++;
    }
    return allIngredients;
}

private static Ingredient ParseItem(string item, int count) {
    string pct = "";
    List<string> items = new List<string>();
    int firstParenIndex = item.IndexOf("(");
    //Console.Write(count + "\t" + item.Substring(0, firstParenIndex).Trim());

    Regex expression = new Regex(@"\((.*?)\)");
    MatchCollection matches = expression.Matches(item);
    bool percentPresent = true;
    foreach (Match match in matches) {
        if (match.ToString().Contains("%")) { // <-- if the string between parentheses does not contain "%" - move to next line, otherwise print on same line
            //Console.WriteLine(" " + match.ToString().Trim());
            pct = match.ToString().Trim();
            percentPresent = false;
        }
        else {
            if (percentPresent) {
                //Console.WriteLine();
            }
            items = GetLastItems(match.ToString().Trim());
        }
    }
    return new Ingredient(count, item.Substring(0, firstParenIndex).Trim(), pct, items);
}

private static List<string> GetLastItems(string inString) {
    List<string> result = new List<string>();
    string temp = inString.Replace("(", "");
    temp = temp.Replace(")", "");
    string[] allItems = temp.Split('-');
    foreach (string curItem in allItems) {
        //Console.WriteLine("\t\t" + curItem.Trim());
        result.Add(curItem.Trim());
    }
    return result;
}

private static string ReplaceCommaBetweenParens(string inString) { 
    string pattern = @"(?<=\([^\)]*)+,(?!\()(?=[^\(]*\))"; 
    return Regex.Replace(inString, pattern, "-"); 
} 
+0

Thank you very much :) – Supreet
