Powershell - xml - 優文庫

I have an input XML file that contains the usual HTML names for various characters. Before:

<?xml version="1.0" encoding="UTF-8"?> 
<OrganisationUnits> 
    <OrganisationUnitsRow num="8"> 
    <OrganisationId>ACME24/7HOME</OrganisationId> 
    <OrganisationName>ACME LTD</OrganisationName> 
    <Notes>Double Quote &quot; Single Quote &pos; Ampersand &amp; </Notes> 
    <Sector>P</Sector> 
    <SectorDesc>Private Private &amp; Voluntary</SectorDesc> 
    </OrganisationUnitsRow> 
</OrganisationUnits>

After:

<?xml version="1.0" encoding="UTF-8"?> 
<OrganisationUnits> 
    <OrganisationUnitsRow num="8"> 
    <OrganisationId>ACME24/7HOME</OrganisationId> 
    <OrganisationName>ACME LTD</OrganisationName> 
    <Notes>Double Quote " Single Quote ' Ampersand &</Notes> 
    <Sector>P</Sector> 
    <SectorDesc>Private Private & Voluntary</SectorDesc> 
    </OrganisationUnitsRow> 
</OrganisationUnits>

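The names are the usual ones, double quote = &quot; etc., as in this line from before processing:

<Notes>Double Quote &quot; Single Quote &pos; Ampersand &amp;</Notes>

I process the file as XML and it is handled fine, nothing very fancy: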

$xml = [xml](Get-Content $path\$File)
foreach ($CMCAddressesRow in $xml.OrganisationUnits.OrganisationUnitsRow) {
    # blah
    # blah
}
$xml.Save("$path\$File")

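When the output is saved, all the HTML codes like &quot; get replaced by ". How can I retain the original HTML &quot; characters? More importantly, why does it happen?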

Source

2017-05-24 zoomzoomvince

Is there a '' on line 6 of that XML file? – lit

System.Net.WebUtility.HtmlDecode and System.Net.WebUtility.HtmlEncode – jdweng
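As a rough illustration of the two methods named in the comment above, the sketch below round-trips a sample string. The sample text and variable names are made up for the demonstration. Note that these are HTML helpers rather than XML ones, so HtmlEncode writes a single quote as a numeric reference instead of &apos;.

```powershell
# Hypothetical sample strings, not taken from the asker's file.
$decoded = [System.Net.WebUtility]::HtmlDecode('Double Quote &quot; Ampersand &amp;')
$decoded      # Double Quote " Ampersand &

$encoded = [System.Net.WebUtility]::HtmlEncode('Double Quote " Ampersand &')
$encoded      # Double Quote &quot; Ampersand &amp;

# HtmlEncode uses a numeric reference for the apostrophe.
$apostrophe = [System.Net.WebUtility]::HtmlEncode("'")
$apostrophe   # &#39;
```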

It looks like the replacement of &quot; has already happened by the time the file is read as [xml]. – lit

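What you are referring to are so-called "character entities". PowerShell converts them on import, so that you work with the actual characters those entities represent, and on export it converts only what must be encoded in an XML file. Quote characters do not need to be encoded in node values, so they are not encoded on export.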
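The behaviour described above can be reproduced with a one-element document. This is a minimal sketch: the variable names are illustrative, and the apostrophe entity is spelled &apos; here, since an undeclared entity name would make the [xml] cast fail.

```powershell
# Parse a one-element document whose text contains three character entities.
$doc = [xml]'<Notes>Double Quote &quot; Single Quote &apos; Ampersand &amp;</Notes>'

# In memory the entities are already the characters they stand for.
$text = $doc.DocumentElement.InnerText
$text    # Double Quote " Single Quote ' Ampersand &

# Serializing re-encodes only the ampersand; quotes are legal in element text.
$out = $doc.OuterXml
$out     # <Notes>Double Quote " Single Quote ' Ampersand &amp;</Notes>
```

Both forms of the file describe the same data to an XML parser, which is why the serializer does not preserve the original spelling.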

Source

2017-05-25 08:11:15

Sorry, typo - my mistake. I found that by using [System.Security.SecurityElement]::Escape($var) I can do what I was looking for: $dbl = '"'; (Get-Content $path\$InFile) | ForEach-Object { $_ -replace '(?is)(?<=.*?)…(?=.*?)', [System.Security.SecurityElement]::Escape($dbl) } | Set-Content $path\$InFile. This takes the " double quote and replaces it with &quot; - very crude, but I am sure it can be improved – zoomzoomvince
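One way the post-processing idea in the comment above might be tidied up is sketched below. It is not the commenter's exact code: the regular expression and variable names are assumptions, and a literal string stands in for the saved file. The entity strings are the ones [System.Security.SecurityElement]::Escape returns for the two quote characters.

```powershell
# Stand-in for the text written by $xml.Save(); quotes arrive unescaped.
$saved = '<?xml version="1.0" encoding="UTF-8"?><Notes>Double Quote " Single Quote '' Ampersand &amp;</Notes>'

# Match only character data, i.e. text sitting between a '>' and the next '<',
# so quotes inside tags (attribute values, the XML declaration) are untouched.
$textNode = '(?<=>)[^<>]+(?=<)'

$result = [regex]::Replace($saved, $textNode, {
    param($m)
    # Re-escape the two quote characters only; '&' is already written as '&amp;'.
    $m.Value.Replace('"', '&quot;').Replace("'", '&apos;')
})

$result
# <?xml version="1.0" encoding="UTF-8"?><Notes>Double Quote &quot; Single Quote &apos; Ampersand &amp;</Notes>
```

Calling Escape on the whole text would also turn the existing &amp; into &amp;amp;, which is why only the quotes are replaced here. The pattern is still crude: it assumes the file has no comments or CDATA sections containing angle brackets.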

In fact, these examples can be checked with the [W3C Markup Validation Service] (https://validator.w3.org/#validate_by_input). – JosefZ
