How do I remove the characters between < and > using a regular expression in C#?

I have a string str="<u>rag</u>". Now I want to get just the string "rag". How can I get it using a regular expression?

My code is below.

The output I get = ""

Thanks in advance.

C# code:

string input="<u>ragu</u>"; 
string regex = "(\\<.*\\>)"; 
string output = Regex.Replace(input, regex, "");

2013-04-10 ragu

Too many backslashes... – Floris 2013-04-10 12:15:43

Is it HTML? Or plain text? – pordi 2013-04-10 12:15:54

@PradipKT It is HTML. – ragu 2013-04-10 12:17:28

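Parsing HTML with regex is not recommended

Regex is meant for patterns that occur regularly. HTML is not regular in its format (XHTML excepted). For example, an HTML file is still valid even when some elements have no closing tag! That can break your code.

Use an HTML parser such as htmlagilitypack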
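To illustrate how fragile regex-based tag stripping can be, here is a minimal sketch. The input string and the names `FragileStripDemo` and `StripTags` are made up for this example and do not come from the question; it assumes a quoted attribute value that contains a `>` character, which the pattern has no way to tell apart from the end of a tag.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FragileStripDemo
{
    // Hypothetical helper: removes everything that looks like <...>.
    public static string StripTags(string html)
    {
        return Regex.Replace(html, "<.*?>", string.Empty);
    }

    public static void Main()
    {
        // Assumed input: the attribute value itself contains a '>'.
        string html = "<a title=\"1 > 0\">link</a>";

        // The first match ends at the '>' inside the attribute value,
        // so part of the tag leaks into the result.
        Console.WriteLine("[" + StripTags(html) + "]"); // prints "[ 0">link]"
    }
}
```

A real parser tracks quotes and element nesting, which is why it copes with input like this where a single pattern does not.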

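Warning {do not try this in your code}

To solve the problem with your regex!

<.*> replaces a < followed by zero or more characters (i.e. u>rag</u) up to the last >

You should use this expression instead

<.*?>

.* is greedy, i.e. it eats as many characters as it can while still matching

.*? is lazy, i.e. it eats as few characters as possible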
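The difference between the two quantifiers can be sketched as a small console program. The class name `GreedyVsLazyDemo` and the helper `Strip` are illustrative names only; the input is the string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Removes every match of the given pattern from the input.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, string.Empty);
    }

    public static void Main()
    {
        string input = "<u>rag</u>";

        // Greedy: the single match runs from the first '<' to the last '>',
        // which is the whole string, so nothing is left.
        Console.WriteLine("[" + Strip(input, "<.*>") + "]");  // prints "[]"

        // Lazy: each match stops at the nearest '>', so only the tags are removed.
        Console.WriteLine("[" + Strip(input, "<.*?>") + "]"); // prints "[rag]"
    }
}
```

The brackets around the output are only there to make the empty result visible.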

2013-04-10 12:19:56 Anirudha

Thanks for the explanation. I understand now, and it works. – ragu 2013-04-10 12:27:55

const string HTML_TAG_PATTERN = "<.*?>"; 
Regex.Replace (str, HTML_TAG_PATTERN, string.Empty);

2013-04-10 12:16:04 cosset

+1 for being the first to come up with this simple non-greedy expression. – Floris 2013-04-10 12:24:12

Sure:

string input = "<u>ragu</u>";
string regex = "(\\<[/]?[a-z]\\>)";
string output = Regex.Replace(input, regex, "");
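One thing to keep in mind with this pattern: `[a-z]` matches exactly one lowercase letter, so it only removes single-letter tags such as `<u>` or `<b>`. The sketch below shows that behaviour next to a variant with `[a-z]+`; the variant, the `<div>` input, and the names `TagNamePatternDemo`, `SingleLetter` and `MultiLetter` are assumptions added for illustration, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagNamePatternDemo
{
    // Pattern from the answer: tag name is exactly one lowercase letter.
    public const string SingleLetter = "(\\<[/]?[a-z]\\>)";

    // Assumed variant: tag name is one or more lowercase letters.
    public const string MultiLetter = "(\\<[/]?[a-z]+\\>)";

    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("<u>ragu</u>", SingleLetter));     // prints "ragu"
        Console.WriteLine(Strip("<div>ragu</div>", SingleLetter)); // prints "<div>ragu</div>"
        Console.WriteLine(Strip("<div>ragu</div>", MultiLetter));  // prints "ragu"
    }
}
```

Neither pattern handles uppercase tag names or tags with attributes.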

2013-04-10 12:16:59

You do not need a regular expression for this.

string input = "<u>rag</u>".Replace("<u>", "").Replace("</u>", ""); 
Console.WriteLine(input);

2013-04-10 12:17:25

Your code is almost correct; one small modification makes it work:

string input = "<u>ragu</u>";
string regex = @"<.*?\>";
string output = Regex.Replace(input, regex, string.Empty);

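The output is 'ragu'.

Edit: This solution may not be the best. An interesting comment from another user in this thread: do not use regular expressions to parse HTML. Indeed, see also RegEx match open tags except XHTML self-contained tags.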

2013-04-10 12:18:07
