2017-06-08

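HTML Agility Pack (C#): unclosed colgroup tag

I have an HTML string that is posted to the server and then validated with HTML Agility Pack. The HTML contains an unclosed colgroup tag.

After sanitizing, a closing colgroup tag does appear, but it is placed between the closing "tbody" and "table" tags.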

BEFORE:

<table width="3265" class="mce-item-table" style="width: 2452pt; border-collapse: collapse;" border="0" cellspacing="0" cellpadding="0"> 
 

 
<colgroup><col width="80" style="width: 60pt;"> 
 
<col width="245" style="width: 184pt;" span="13"> <!-- MISSING COLGROUP tag--> 
 
<tbody><tr height="20" style="height: 15pt;"> 
 
    <td width="80" height="20" style="width: 60pt; height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">31109173</span></td> 
 
    <td width="245" style="width: 184pt; font-family: Arial; font-size: 9pt;">31109173</td> 
 
    <td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 09,2017 9:54 AM</td> 
 
    <td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 08,2017 5:21 PM</td> 
 
</tr> 
 
<tr height="20" style="height: 15pt;"> 
 
    <td height="20" style="height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">30933775</span></td> 
 
    <td style="font-family: Arial; font-size: 9pt;">30933775</td> 
 
    <td align="right" style="font-family: Arial; font-size: 9pt;">May 09,2017 9:50 AM</td> 
 
    <td align="right" style="font-family: Arial; font-size: 9pt;">Apr 28,2017 6:22 PM</td> 
 
</tr> 
 
</tbody></table>

AFTER:

<table width="3265" class="mce-item-table" style="width: 2452pt; border-collapse: collapse;" border="0" cellspacing="0" cellpadding="0"> 
 

 
<colgroup><col width="80" style="width: 60pt;"> 
 
<col width="245" style="width: 184pt;" span="13"> 
 
<tbody><tr height="20" style="height: 15pt;"> 
 
    <td width="80" height="20" style="width: 60pt; height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">31109173</span></td> 
 
    <td width="245" style="width: 184pt; font-family: Arial; font-size: 9pt;">31109173</td> 
 
    <td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 09,2017 9:54 AM</td> 
 
    <td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 08,2017 5:21 PM</td> 
 
</tr> 
 
<tr height="20" style="height: 15pt;"> 
 
    <td height="20" style="height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">30933775</span></td> 
 
    <td style="font-family: Arial; font-size: 9pt;">30933775</td> 
 
    <td align="right" style="font-family: Arial; font-size: 9pt;">May 09,2017 9:50 AM</td> 
 
    <td align="right" style="font-family: Arial; font-size: 9pt;">Apr 28,2017 6:22 PM</td> 
 
</tr> 
 
</tbody></colgroup></table> 
 

 
<!-- ^^ </colgroup> has appeared above-->

I tried setting the "OptionFixNestedTags" flag to true, but I still get the same result.
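
For reference, the attempt described above might look roughly like the sketch below. The class and method names (`AgilityPackAttempt`, `Reserialize`) and the `postedHtml` parameter are hypothetical and not from the original post; only `HtmlDocument`, `OptionFixNestedTags`, `LoadHtml`, and `DocumentNode.OuterHtml` are actual HTML Agility Pack members.

```csharp
using HtmlAgilityPack;

public static class AgilityPackAttempt
{
    // Hypothetical helper: parses the posted markup with HTML Agility Pack
    // and writes it back out as a string.
    public static string Reserialize(string postedHtml)
    {
        var doc = new HtmlDocument
        {
            // The flag tried in the question. It must be set before LoadHtml.
            OptionFixNestedTags = true
        };
        doc.LoadHtml(postedHtml);

        // Serialize the parsed tree. This is where the misplaced
        // </colgroup> reported in the question shows up.
        return doc.DocumentNode.OuterHtml;
    }
}
```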

Answer


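I tried several of the options in HTML Agility Pack and set them to true. None of them fixed it.

// "doc" stands for the HtmlAgilityPack.HtmlDocument instance being loaded.
doc.OptionFixNestedTags = true; 
doc.OptionAutoCloseOnEnd = true; 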

There is a good NuGet package for sanitizing HTML, and it solved the problem I was having: HtmlSanitizer

Hope this helps.
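
A minimal sketch of how HtmlSanitizer could be called, assuming the package's default settings are acceptable for this markup. The class and method names (`HtmlCleaner`, `Clean`) and the `postedHtml` parameter are hypothetical. The namespace depends on the package version: recent releases use `Ganss.Xss`, while older ones (including those available in 2017) use `Ganss.XSS`. HtmlSanitizer parses with AngleSharp, an HTML5 parser, so the implied closing colgroup tag should end up before the tbody element, but it is worth checking the output against your own input.

```csharp
using Ganss.Xss; // assumption: a recent package version; older releases use Ganss.XSS

public static class HtmlCleaner
{
    // Hypothetical helper: sanitizes posted markup with the default
    // allow-lists for tags, attributes, and CSS properties.
    public static string Clean(string postedHtml)
    {
        var sanitizer = new HtmlSanitizer();
        return sanitizer.Sanitize(postedHtml);
    }
}
```

Note that the sanitizer also removes anything outside its allow-lists, so attributes or style properties it does not recognize may be dropped from the result.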