Remove all HTML tags except certain tags

I have some code that removes all HTML tags, but I want to remove every HTML tag except </td> and </tr>.

How can I do this?

// Requires: using System.Text.RegularExpressions;
public string HtmlStrip(string input)
{
    // Replace each <input>...</input> block, including its content, with "*".
    input = Regex.Replace(input, @"<input>(.|\n)*?</input>", "*");
    // Replace each <xml>...</xml> block, including its content, with "*".
    input = Regex.Replace(input, @"<xml>(.|\n)*?</xml>", "*");
    // Replace every remaining tag with "*" but keep its content:
    // "<p>bob<span> johnson</span></p>" becomes "*bob* johnson**".
    return Regex.Replace(input, @"<(.|\n)*?>", "*");
}

Source

2013-03-21 Ashekur Rahman Molla Asik

+10

Just remember http://stackoverflow.com/a/1732454/1283124 – 2013-03-21 20:39:56

I saw it, but I don't understand... – 2013-03-21 20:41:49

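@IlyaIvanov reminded me of exactly the same thing. OP, using regular expressions to parse HTML is a risky venture. You should use another approach (such as treating the HTML as XML). – tnw 2013-03-21 20:42:11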
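
If a regex is still acceptable for simple, trusted input, one possible way to keep </td> and </tr> while removing every other tag is a negative lookahead. The following is only a minimal sketch: the class name HtmlStripper and the method name StripExceptTdTrClose are hypothetical, and it replaces tags with an empty string rather than the "*" used in the question. It assumes no attribute value contains a ">" character, and it does not handle comments, scripts, or malformed markup, which is exactly where regex-based HTML processing tends to break.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original question.
public static class HtmlStripper
{
    // Matches any tag except the closing tags </td> and </tr>.
    // (?!/t[dr]\s*>) is a negative lookahead: the match is rejected when the
    // text right after "<" is "/td>" or "/tr>" (optional whitespace before ">").
    // [^>]* also matches newlines, so tags spanning several lines are covered.
    private static readonly Regex TagsExceptTdTrClose = new Regex(
        @"<(?!/t[dr]\s*>)[^>]*>",
        RegexOptions.IgnoreCase);

    // Removes every tag except </td> and </tr>; text content is left untouched.
    public static string StripExceptTdTrClose(string input)
    {
        return TagsExceptTdTrClose.Replace(input, string.Empty);
    }
}
```

For example, HtmlStripper.StripExceptTdTrClose("<table><tr><td>bob</td><td><b>johnson</b></td></tr></table>") returns "bob</td>johnson</td></tr>". Passing "*" instead of string.Empty to Replace would mirror the replacement used in the question.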