2010-11-27 62 views
0
$content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>--> 
    <div class="popular-video-image"> 
     <a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>"> 
      <img src="/images/topvideo/1.jpg" alt=""/> 
     </a> 
     <span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span> 
     <span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span> 
    </div>'; 

    $dom = new DOMDocument; 
    $dom->preserveWhiteSpace = false; 
    $dom->loadHTML($content); 
    foreach ($dom->getElementsByTagName('a') as $node) 
    { 
     $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href')); 
    } 
    $dom->formatOutput = true; 

    echo $dom->saveXml($dom->documentElement); 

Output: the PHP DOMDocument echo problem

<html> 
    <body> 
    <div class="popular-video-image">&#13; 
     <a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;">&#13; 
      <img src="/images/topvideo/1.jpg" alt=""/></a>&#13; 
     <span class="popular-video-artist ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Far East Movement</a></span>&#13; 
     <span class="popular-video-title ellipsis"><a href="http://mysite.ru/video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="&lt;lang video_go_to=Far East Movement - Like a G6&gt;" class="ellipsis">Like a G6</a></span>&#13; 
    </div> 

    </body> 
</html> 

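I don't want the html and body tags to be added. I also don't want the <lang> tag to be replaced with &lt;lang&gt;. And the &#13; at the end of each line is unnecessary too.

I want to get back the same content that went in, with only the links modified.

Sorry for my bad English!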

Answers

0

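I think the <html><body> tags are being added because you are using loadHTML. Try using loadXML instead.

As for &lt;lang&gt;, it is escaped because otherwise the resulting XML would be invalid. If that causes you problems, you should change your approach a little and work with it rather than against it.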
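To illustrate the point about escaping, here is a minimal sketch. The markup is a shortened, made-up stand-in for the HTML in the question. It suggests that the escaping happens only when the node is serialized: the DOM itself still holds the raw attribute value, so code that reads it through getAttribute sees the original angle brackets.

```php
<?php
// Collect parser messages internally so the sketch runs without warnings.
libxml_use_internal_errors(true);

// Hypothetical, shortened version of the markup from the question.
$html = '<div><a href="video/x/" title="<lang video_go_to=Like a G6>">Like a G6</a></div>';

$dom = new DOMDocument;
$dom->loadHTML($html);

$link = $dom->getElementsByTagName('a')->item(0);

// The DOM keeps the raw value, angle brackets included.
$title = $link->getAttribute('title');
echo $title, "\n";

// Serializing as XML escapes "<" and ">" inside the attribute value.
$serialized = $dom->saveXML($link);
echo $serialized, "\n";
```

If the escaped form is only a problem for a later processing step, decoding at that step (for example with htmlspecialchars_decode) may be simpler than fighting the serializer.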

loadXML shows nothing but errors – Isis 2010-11-27 12:17:54

Well, maybe the errors can be fixed? :) – Jon 2010-11-27 12:19:44
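A sketch of how those loadXML errors could be inspected, assuming the cause is the raw "<" inside the title attribute (which is not well-formed XML). The fragment below is a shortened, hypothetical stand-in for the question's markup, and the "fixed" variant only shows that escaping the attribute makes the fragment loadable. It is not a general HTML-to-XML converter.

```php
<?php
// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical fragment with a raw "<" inside an attribute value.
$fragment = '<a href="video/x/" title="<lang video_go_to=Like a G6>">Like a G6</a>';

$dom = new DOMDocument;
$loaded = $dom->loadXML($fragment);

// Print what the parser complained about.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}
libxml_clear_errors();

// The same fragment with the angle brackets escaped is well-formed XML.
$fixed = '<a href="video/x/" title="&lt;lang video_go_to=Like a G6&gt;">Like a G6</a>';
$loadedFixed = $dom->loadXML($fixed);

var_dump($loaded, $loadedFixed);
```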

3

saveXml takes an optional parameter that specifies the node to output.

$dom->saveXml($dom->documentElement->firstChild->firstChild); 

This removes the html and body tags from the output.
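Note that firstChild->firstChild selects only the first node inside body. If the fragment has several top-level elements, one possible variation (a sketch, using made-up sample markup rather than the question's HTML) is to serialize every child of body in turn:

```php
<?php
// Collect parser messages internally so the sketch runs without warnings.
libxml_use_internal_errors(true);

// Hypothetical fragment with two top-level elements.
$content = '<div class="a"><a href="video/x/">One</a></div>'
         . '<p><a href="video/y/">Two</a></p>';

$dom = new DOMDocument;
$dom->loadHTML($content);

// Same link rewrite as in the question.
foreach ($dom->getElementsByTagName('a') as $node) {
    $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href'));
}

// Serialize each child of <body> instead of the whole document,
// so the implied <html> and <body> wrappers never reach the output.
$body = $dom->getElementsByTagName('body')->item(0);
$output = '';
foreach ($body->childNodes as $child) {
    $output .= $dom->saveXML($child);
}

echo $output, "\n";
```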

0
<?php 
    $content = '<!--<sup><span style="font-weight:bold;color:black;">0</span></sup><br/>--> 
    <div class="popular-video-image"> 
     <a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>"> 
      <img src="/images/topvideo/1.jpg" alt=""/> 
     </a> 
     <span class="popular-video-artist ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Far East Movement</a></span> 
     <span class="popular-video-title ellipsis"><a href="video/Far+East+Movement - Like+a+G6/w4s6H4ku6ZY/" title="<lang video_go_to=Far East Movement - Like a G6>" class="ellipsis">Like a G6</a></span> 
    </div>'; 

    $dom = new DOMDocument; 
    $dom->preserveWhiteSpace = false; 
    $dom->loadHTML($content); 
    foreach ($dom->getElementsByTagName('a') as $node) 
    { 
     $node->setAttribute('href', 'http://mysite.ru/' . $node->getAttribute('href')); 
    } 
    $dom->formatOutput = true; 

    echo preg_replace('#^<!DOCTYPE.+?>#', '', str_replace(array('<html>', '</html>', '<body>', '</body>', "\n\n", '&lt;', '&gt;'), array('', '', '', '', '', '<', '>',), $dom->saveHTML())); 
4

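You see &#13; at the end of every line because your HTML has Windows-style line endings (CR+LF). To get rid of them, you can run this before you feed the content into DOMDocument, to convert them to Unix-style line endings (LF):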

$content = preg_replace('/\r\n/', "\n", $content);
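Putting that together with the serialization step, a minimal sketch might look like the following. The sample markup is made up, and the pattern is slightly widened (an assumption beyond the original answer) so that a lone CR is converted as well.

```php
<?php
// Collect parser messages internally so the sketch runs without warnings.
libxml_use_internal_errors(true);

// Hypothetical fragment with Windows-style (CR+LF) line endings.
$content = "<div>\r\n <a href=\"video/x/\">Like a G6</a>\r\n</div>";

// Convert CR+LF, and any lone CR, to LF before parsing.
$normalized = preg_replace('/\r\n?/', "\n", $content);

$dom = new DOMDocument;
$dom->loadHTML($normalized);

// With no carriage returns left in the input, the serializer
// has nothing to emit as &#13;.
$after = $dom->saveXML($dom->documentElement);
echo $after, "\n";
```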