Retain HTML tags in JSON-to-XML conversion

I have a JSON object that I convert to XML with the following code:

private string ConvertFileToXml(string file) 
{ 
    string fileContent = File.ReadAllText(file); 
    XmlDocument doc = JsonConvert.DeserializeXmlNode(fileContent, "root"); 

    // Retain html tags. 
    doc.InnerXml = HttpUtility.HtmlDecode(doc.InnerXml); 

    return XDocument.Parse(doc.InnerXml).ToString(); 
}

where the JSON is the following object:

{ 
    "id": "2639", 
    "type": "www.stack.com", 
    "bodyXML": "\n<body><p>Democrats also want to 「reinvigorate and modernise」 US <ft-content type=\"http://www.stack.com/ontology/content/Article\" url=\"http://api.stack.com/content/d2c32614-61c6-11e7-91a7-502f7ee26895\">antitrust</ft-content> laws for a broad attack on corporations.</p>\n<p>Mr Schumer said the Democrats’ new look should appeal to groups that backed Mrs Clinton, such as the young and minority groups, and members of the white working-class who deserted Democrats for Mr Trump. </p>\n</body>", 
    "title": "Democrats seek to reclaim populist mantle from Donald Trump", 
    "standfirst": "New economic plan is pitched as an assault on growing corporate power", 
    "byline": "David J Lynch in Washington", 
    "firstPublishedDate": "2017-07-24T17:51:25Z", 
    "publishedDate": "2017-07-24T17:50:25Z", 
    "requestUrl": "http://api.stack.com/content/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c", 
    "brands": [ 
    "http://api.ft.com/things/dbb0bdae-1f0c-11e4-b0cb-b2227cce2b54" 
    ], 
    "standout": { 
    "editorsChoice": false, 
    "exclusive": false, 
    "scoop": false 
    }, 
    "canBeSyndicated": "yes", 
    "webUrl": "http://www.stack.com/cms/s/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c.html" 
}

and the method produces this output:

<root> 
    <id>2639</id> 
    <type>www.stack.com</type> 
    <bodyXML> 
&lt;p&gt;Democrats also want to 「reinvigorate and modernise」 US &lt;ft-content type="http://www.stack.com/ontology/content/Article" url="http://api.stack.com/content/d2c32614-61c6-11e7-91a7-502f7ee26895"&gt;antitrust&lt;/ft-content&gt; laws for a broad attack on corporations.&lt;/p&gt; 
&lt;p&gt;Mr Schumer said the Democrats’ new look should appeal to groups that backed Mrs Clinton, such as the young and minority groups, and members of the white working-class who deserted Democrats for Mr Trump. &lt;/p&gt; 
&lt;/body&gt;</bodyXML> 
    <title>Democrats seek to reclaim populist mantle from Donald Trump</title> 
    <standfirst>New economic plan is pitched as an assault on growing corporate power</standfirst> 
    <byline>David J Lynch in Washington</byline> 
    <firstPublishedDate>2017-07-24T17:51:25Z</firstPublishedDate> 
    <publishedDate>2017-07-24T17:50:25Z</publishedDate> 
    <requestUrl>http://api.stack.com/content/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c</requestUrl> 
    <brands>http://api.ft.com/things/dbb0bdae-1f0c-11e4-b0cb-b2227cce2b54</brands> 
    <standout> 
    <editorsChoice>false</editorsChoice> 
    <exclusive>false</exclusive> 
    <scoop>false</scoop> 
    </standout> 
    <canBeSyndicated>yes</canBeSyndicated> 
    <webUrl>http://www.stack.com/cms/s/e8bec6dc-708d-11e7-aca6-c6bd07df1a3c.html</webUrl> 
</root>

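In the original JSON, "bodyXML" holds text that contains HTML tags, but after the conversion those tags are escaped into HTML entities. What I want is to keep the HTML tags intact after the conversion.

How can I do that?

Any help would be much appreciated!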

Source

2017-08-14 Bodz

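Are you *sure* this is what you want to do? Any error in the HTML excerpt you include will make your entire XML file unparseable (in fact, your example has a closing '</body>' tag with no matching opening tag). How about rebuilding the HTML tags at the point where you decode? – Phylogenesis

Yes, I am sure I want to keep the HTML tags. – Bodz

@Phylogenesis How do I rewrite the HTML tags when decoding? – Bodz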

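I don't think it is possible to have unencoded HTML tags in the inner text of an XML node, but you can HTML-decode the inner text of the node after the XmlDocument has been parsed.

That gives you the text with all the HTML tags intact.

For example: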

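// Requires: using System.IO; using System.Xml; using Newtonsoft.Json;
private static string ConvertFileToXml()
{
    string fileContent = File.ReadAllText("text.json");
    XmlDocument doc = JsonConvert.DeserializeXmlNode(fileContent, "root");

    // Decode only the text of <bodyXML>; the rest of the document is left untouched.
    return System.Web.HttpUtility.HtmlDecode(doc.SelectSingleNode("root").SelectSingleNode("bodyXML").InnerText);
}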
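To see why the tags only look "encoded" in the output, here is a minimal sketch using nothing but System.Xml. The class and method names (EscapingDemo, BuildDocument, SerializedBody, RawBody) are hypothetical and not part of the original post. It suggests that the markup is stored unchanged as text in the DOM and that the entities appear only when the node is serialized.

```csharp
using System.Xml;

public static class EscapingDemo
{
    // Builds <root><bodyXML>...</bodyXML></root> with the markup stored as plain text,
    // which is how a JSON string value ends up in the converted document.
    public static XmlDocument BuildDocument(string markup)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("root");
        XmlElement body = doc.CreateElement("bodyXML");
        body.InnerText = markup;
        root.AppendChild(body);
        doc.AppendChild(root);
        return doc;
    }

    // Serialized form of the node content: '<' and '>' are written as entities.
    public static string SerializedBody(XmlDocument doc)
    {
        return doc.SelectSingleNode("/root/bodyXML").InnerXml;
    }

    // Text value of the node: the original string, with the tags intact.
    public static string RawBody(XmlDocument doc)
    {
        return doc.SelectSingleNode("/root/bodyXML").InnerText;
    }
}
```

Calling RawBody on a document built from "<p>antitrust</p>" should hand back that same string, while SerializedBody shows the entity form seen in the question's output.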

Required namespace: System.Web (System.Web assembly)
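If the goal is for the tags to appear as real elements inside the XML document rather than as a separate decoded string, one possible variation is sketched below. It is not the code from this answer: MarkupEmbedder and EmbedMarkup are hypothetical names, and the CDATA fallback is an assumption about how one might handle the malformed-HTML concern raised in the comments on the question.

```csharp
using System.Xml;

public static class MarkupEmbedder
{
    // Hypothetical helper: replaces the escaped text of the node at `xpath`
    // with real child elements when the text parses as XML, and with a
    // CDATA section when it does not.
    public static void EmbedMarkup(XmlDocument doc, string xpath)
    {
        XmlNode node = doc.SelectSingleNode(xpath);
        if (node == null)
        {
            return; // nothing to do if the node is absent
        }

        string markup = node.InnerText; // already entity-decoded by the DOM

        try
        {
            // The setter parses the string; malformed markup throws XmlException.
            node.InnerXml = markup;
        }
        catch (XmlException)
        {
            // Drop any partially loaded children (RemoveAll also clears attributes),
            // then keep the raw markup as CDATA so the document stays parseable.
            node.RemoveAll();
            node.AppendChild(doc.CreateCDataSection(markup));
        }
    }
}
```

With the document returned by JsonConvert.DeserializeXmlNode(fileContent, "root"), a call such as MarkupEmbedder.EmbedMarkup(doc, "/root/bodyXML") would run before reading doc.OuterXml. Markup that is not well-formed XML, for example an unmatched closing tag or an HTML-only entity such as &nbsp;, takes the CDATA branch instead of breaking the whole document.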

Source

2017-08-14 09:30:54 Dinny

This works; I have adapted the new code into my original post. – Bodz

No worries, mate! – Dinny
