Extract the ID and replace everything in `Example HTML`

I'm new to regular expressions. I want to find the text below in my HTML, extract the ID, and replace the whole `Example HTML` with something else.

Example HTML:

{{Object id='foo'}}

Extract the ID into a variable like this:

string strId = "foo";

So far I have the following regex code, which captures the Example HTML:

string strStart = "Object"; 
string strFind = "{{(" + strStart + ".*?)}}"; 
Regex regExp = new Regex(strFind, RegexOptions.IgnoreCase); 

Match matchRegExp = regExp.Match(html); 

while (matchRegExp.Success) 
{ 

    //At this point, I have this variable: 
    //{{Object id='foo'}} 

    //I can find the id='foo' (see below) 
    //but not sure how to extract 'foo' and use it 

    string strFindInner = "id='(.*?)'"; //"{{Slider"; 
    Regex regExpInner = new Regex(strFindInner, RegexOptions.IgnoreCase); 
    Match matchRegExpInner = regExpInner.Match(matchRegExp.Value.ToString()); 

    //Do something with 'foo' 

    matchRegExp = matchRegExp.NextMatch(); 
}

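I understand this may have a simple solution. I'd like to learn more about regular expressions, but more importantly I'd appreciate advice on a cleaner, more efficient way to handle this.

Thanks

Edit:

Here is an example I could potentially use: c# regex replace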
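For reference, here is a minimal sketch of how a single pattern with a named capture group could both extract the id and drive the replacement. The class name `ShortcodeDemo` and the helper `RenderObject` are hypothetical stand-ins (the real replacement markup would come from wherever the object is stored), and the pattern assumes the tag always has the exact shape `{{Object id='...'}}`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShortcodeDemo
{
    // Matches {{Object id='foo'}} and captures the id value in the named group "id".
    // The braces are escaped so they are always read as literal characters.
    private static readonly Regex ObjectTag = new Regex(
        @"\{\{Object\s+id='(?<id>[^']*)'\s*\}\}",
        RegexOptions.IgnoreCase);

    // Hypothetical stand-in for the real lookup that builds the replacement markup.
    public static string RenderObject(string id)
    {
        return "<p id=\"" + id + "\">This is the replacement text</p>";
    }

    public static string ReplaceTags(string html)
    {
        // The evaluator runs once per match; Groups["id"].Value is the bare id.
        return ObjectTag.Replace(html, m => RenderObject(m.Groups["id"].Value));
    }

    public static void Main()
    {
        string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>";

        // Extraction only: read the capture group from the first match.
        Match first = ObjectTag.Match(html);
        string strId = first.Success ? first.Groups["id"].Value : null;
        Console.WriteLine(strId); // prints foo

        Console.WriteLine(ReplaceTags(html));
    }
}
```

Because `Regex.Replace` accepts a `MatchEvaluator`, the outer `while` loop and the second inner regex are not needed in this sketch: the same match supplies both the text to replace and the captured id.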

2017-08-17 Derek

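Stop! Look and listen! Every day someone wakes up with the great idea of parsing HTML with regex. Nothing parses HTML better than an XML parser. The way you asked your question may hide how hard this is: using '{{' instead of '<>' hides the fact that parsing a comment like "> _ <<3 I luv you => _o /" can turn your regex into a nightmare. In your head, a regex is a simple "look for this". It isn't! A regex that parses HTML has to be recursive and start over each time. Use a parser and your code will be simple, just like in JS. –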

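Thanks, I value your input. RegEx seemed like the easy approach, but apparently it isn't. I have tried going the `SubString` and `IndexOf` route, because I'm trying to do something similar to what WordPress's doShortCode() does, and I was able to find documentation on how that currently works. I'm looking to get a proof of concept and move forward from there. – Derek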

Use an HTML parser such as [Html Agility Pack (HAP)](http://html-agility-pack.net/?z=codeplex). One simple NuGet install and bam, you can select anything you want in the HTML. There isn't much to learn, and it isn't hard. –
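To make the parser suggestion concrete, here is a rough sketch of how Html Agility Pack might be used to locate the placeholder. It assumes the `HtmlAgilityPack` NuGet package is installed, the class name `HapDemo` is hypothetical, and it only finds the text node that holds the `{{Object ...}}` marker; pulling the id out of that text would still need string handling, since the marker is not an HTML element.

```csharp
using System;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

public static class HapDemo
{
    public static void Main()
    {
        string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var textNodes = doc.DocumentNode.SelectNodes("//text()");
        if (textNodes == null) return;

        foreach (HtmlNode node in textNodes)
        {
            // Only text content is inspected, so markup such as attributes
            // or tag names can never be mistaken for a placeholder.
            if (node.InnerText.Contains("{{Object"))
            {
                Console.WriteLine("Placeholder found in text node: " + node.InnerText);
            }
        }
    }
}
```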

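Although I haven't solved my original regex problem, I have moved to a simpler solution for the time being using SubString, IndexOf and string.Split. I know my code needs cleaning up, but I thought I'd post the answer I have so far.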

string html = "<p>Start of Example</p>{{Object id='foo'}}<p>End of example</p>"; 
string strObject = "Object"; //Example, this could be "Slider" for a {{Slider ...}} tag 

//When found, this will contain "{{Object id='foo'}}" 
string strCode = ""; 

//ie: "id='foo'" 
string strCodeInner = ""; 

//Tags will be a list, but in this example, only "id='foo'" 
string[] tags = { }; 

//Looking for the following "{{Object " 
string strFindStart = "{{" + strObject + " "; 
int intFindStart = html.IndexOf(strFindStart); 

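//Then ending in the following, searching only after the start marker 
string strFindEnd = "}}"; 
int intFindEnd = intFindStart == -1 ? -1 : html.IndexOf(strFindEnd, intFindStart); 

//Must find both Start and End conditions 
if (intFindStart != -1 && intFindEnd != -1) 
{ 
    //Include the closing "}}" in the captured code 
    intFindEnd += strFindEnd.Length; 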
    strCode = html.Substring(intFindStart, intFindEnd - intFindStart); 

    //Remove Start and End 
    strCodeInner = strCode.Replace(strFindStart, "").Replace(strFindEnd, ""); 

    //Split by spaces, this needs to be improved if more than IDs are to be used 
    //but for proof of concept this is perfect 
    tags = strCodeInner.Split(new char[] { ' ' }); 
} 

Dictionary<string, string> dictTags = new Dictionary<string, string>(); 
foreach (string tag in tags) 
{ 
    string[] tagSplit = tag.Split(new char[] { '=' }); 
    dictTags.Add(tagSplit[0], tagSplit[1].Replace("'", "").Replace("\"", "")); 
} 

//At this point, I can replace "{{Object id='foo'}}" with anything I'd like 
//What I don't show is that I go into the website's database, 
//get the object (ie: Slider) and return the html for slider with the ID of foo 
//Placeholder for the html returned from the database 
string strView = "<p id=\"foo\">This is the replacement text</p>"; 
html = html.Replace(strCode, strView); 

/* 
    "html" variable may contain: 

    <p>Start of Example</p> 
    <p id="foo">This is the replacement text</p> 
    <p>End of example</p> 

*/
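The split-by-spaces step is the part flagged above as needing improvement once more than an id is involved. One possible direction, sketched here under assumptions, is to match `name='value'` or `name="value"` pairs with a small regex so that values containing spaces survive. The class `TagAttributes` and method `Parse` are hypothetical names, and the pattern assumes every value is quoted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagAttributes
{
    // name='value' or name="value"; the value may contain spaces.
    // .NET allows the group name "value" to appear in both alternatives.
    private static readonly Regex Pair = new Regex(
        @"(?<name>\w+)\s*=\s*(?:'(?<value>[^']*)'|""(?<value>[^""]*)"")");

    public static Dictionary<string, string> Parse(string inner)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Pair.Matches(inner))
        {
            // Indexer assignment: a repeated name overwrites instead of throwing like Add would.
            result[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return result;
    }

    public static void Main()
    {
        var tags = Parse("id='foo' title=\"Hello world\"");
        Console.WriteLine(tags["id"]);    // prints foo
        Console.WriteLine(tags["title"]); // prints Hello world
    }
}
```

Passing the `strCodeInner` string from the code above to `Parse` would replace both the `Split(' ')` call and the `foreach` loop that fills `dictTags`, and an empty input simply yields an empty dictionary.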

2017-08-21 00:45:15 Derek
