How to convert a page break in .docx into a page break in HTML using PHP

I'm reading a .docx file with zip_read(), and I noticed that in .docx a page break is encoded as <w:br w:type="page"></w:br>. I want to turn it into <br style="page-break-before: always"> so that I can output it in HTML. How can I do that? Thanks!


2017-10-19 user2335065

How about 'str_replace'? – rtfm

It doesn't seem to work – user2335065

We can't guess your code. – rtfm

It's an XML file format, so you can read it with DOM:

$xml = <<< WORD 
<?xml version="1.0" encoding="utf-8" standalone="yes"?> 
<?mso-application progid="Word.Document"?> 
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"> 
    <w:body> 
    <w:br w:type="page"></w:br> 
    </w:body> 
</w:wordDocument> 
WORD; 

libxml_use_internal_errors(true); 
$dom = new DOMDocument(); 
$dom->loadXML($xml); 
libxml_clear_errors(); 

$xpath = new DOMXPath($dom); 
$xpath->registerNamespace('w', 'http://schemas.microsoft.com/office/word/2003/wordml'); 

// find all the page breaks 
foreach ($xpath->evaluate('//w:br[@w:type="page"]') as $page_break) { 
    // create an html break element with some style attribute 
    $html_break = $dom->createElement('br'); 
    $html_break->setAttribute('style', 'page-break-before: always'); 
    // replace the page break with the html break in the document 
    $page_break->parentNode->replaceChild($html_break, $page_break); 
} 
echo $dom->saveHTML();

This converts the Word page break into an HTML page break, as requested:

<?mso-application progid="Word.Document"><w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"> 
    <w:body> 
    <br style="page-break-before: always"> 
    </w:body> 
</w:wordDocument>

Not that this makes much sense, given that the rest of the Word XML still remains as it is. But this is how you would approach it with an XML parser.
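A rough sketch of how the same DOM approach might be pointed at a real .docx, so that the surrounding Word markup is turned into HTML as well. It assumes the main document part is word/document.xml and that it uses the WordprocessingML namespace http://schemas.openxmlformats.org/wordprocessingml/2006/main rather than the 2003 one above. The helper name wordXmlToHtml and the sample markup are made up for illustration, and everything except paragraphs, text runs and breaks is ignored.

```php
<?php
// WordprocessingML namespace used inside .docx packages.
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical helper: turns the paragraphs of a document.xml string
// into very plain HTML. Styles, tables, images etc. are not handled.
function wordXmlToHtml(string $documentXml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($documentXml);

    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', W_NS);

    $html = '';
    foreach ($xpath->query('//w:body/w:p') as $paragraph) {
        $html .= '<p>';
        // Text nodes and breaks are returned in document order.
        foreach ($xpath->query('.//w:t | .//w:br', $paragraph) as $node) {
            if ($node->localName === 't') {
                $html .= htmlspecialchars($node->textContent);
            } elseif ($node->getAttributeNS(W_NS, 'type') === 'page') {
                // Page break: emit the styled <br> from the question.
                $html .= '<br style="page-break-before: always">';
            } else {
                // Any other w:br is treated as a plain line break.
                $html .= '<br>';
            }
        }
        $html .= "</p>\n";
    }
    return $html;
}

// Made-up sample standing in for the contents of word/document.xml.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>First page</w:t></w:r></w:p>
    <w:p><w:r><w:br w:type="page"/></w:r><w:r><w:t>Second page</w:t></w:r></w:p>
  </w:body>
</w:document>
XML;

echo wordXmlToHtml($sample);
```

For an actual file, the XML could be pulled out with ZipArchive instead of zip_read(), for example with $zip->getFromName('word/document.xml') after opening the archive, and then passed to the helper.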

Make sure to check out https://github.com/PHPOffice/PHPWord as well.


2017-10-19 06:06:39 Gordon
