Limiting the length of an XML/HTML string

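I want to parse an XML file and display the first 150 words of each article followed by a READ MORE link. However, the preview does not come out at exactly 150 words. I also don't know how to keep it from picking up IMG tag code and the like. The code is below.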

// Script displays 3 most recent blog posts from blog.pinchit.com (blog.pinchit.com/api/read) 
    // The entries on homepage show the first 150 words of description and "READ MORE" link 

    // PART 1 - PARSING 

    // if it was a JSON file 
    // $string=file_get_contents("http://blog.pinchit.com/api/read"); 
    // $json_a=json_decode($string,true); 
    // var_export($json_a); 


    // XML Parsing 
    $file = "http://blog.pinchit.com/api/read"; 
    $posts_to_display = 3; 
    $posts = array(); 

    // get all the file nodes 
    if(!$xml=simplexml_load_file($file)){ 
     trigger_error('Error reading XML file',E_USER_ERROR); 
    } 

    // counter for posts member array 
    $counter = 0; 

    // Accessing elements within an XML document that contain characters not permitted under PHP's naming convention 
    // (e.g. the hyphen) can be accomplished by encapsulating the element name within braces and the apostrophe. 

    foreach($xml->posts->post as $post){ 

     //post's title 
     $posts[$counter]['title'] = $post->{'regular-title'}; 

     // post's full body 
     $posts[$counter]['body'] = $post->{'regular-body'}; 

     // post's body's first 150 words 
     //for some reason, I am not sure if it's exactly 150 
     $posts[$counter]['preview'] = substr($posts[$counter]['body'], 0, 150); 

     //strip all the html tags so it doesn't mess up the page 
     $posts[$counter]['preview'] = strip_tags($posts[$counter]['preview']); 


     //post's id 
     $posts[$counter]['id'] = $post->attributes()->id; 


     $posts_to_display--; 
     $counter++; 
     //exit the for loop after we parse out all the articles that we want 
     if ($posts_to_display == 0) break; 
    } 

    // Displays all of the posts 

    foreach($posts as $post){ 

     echo "<b>" . $post['title'] . "</b>"; 
     echo "<br/>"; 
     echo $post['preview']; 
     echo " <a href='http://blog.pinchit.com/post/" . $post['id'] . "'>Read More</a>"; 
     echo "<br/><br/>"; 

    }

Here is what the result looks like right now.

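Editor's Pick: Club Sportiva Nothing makes you feel quite as free and in control as a day behind the wheel of a sleek, sophisticated, sexy sports car. It's no wonder Read More

Pinch Drinks & Rock: Hotel Utah Saloon Hotel Utah Read More

Monday Menu: Spicy Grapefruit, Chili Peppers, Creamsicles Feeling summery and delicious today, and we have to admit it took a lot to resist the urge to make it all appetizers, all desserts, or all drinks. Read More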

2011-08-01 CodeCrack

The HTML tags are being counted toward your character total. Strip the tags out first, then take your preview sample:

$preview = strip_tags($posts[$counter]['body']); 
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';

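Also, people usually append an ellipsis ("...") to the end of truncated text to indicate that it continues.

Note that this has the potential drawback of removing tags you do want, such as <p> and <br>. If you want to keep those, you can pass them as the second argument to strip_tags: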

$preview = strip_tags($posts[$counter]['body'], '<br><p>'); 
$posts[$counter]['preview'] = substr($preview, 0, 150).'...';

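Be forewarned, though, that XML-style tags (<br />) may trip this up. If you are dealing with mixed XML/HTML, you may need to step up your tag filtering with something like htmLawed, but the concept stays the same: get rid of the HTML before truncating.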
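Since the question asks for 150 words while substr counts characters, here is a rough sketch of the same strip-first idea applied to words. The function name preview_words and its parameters are made up for illustration; they are not part of the original script.

```php
<?php
// Hypothetical helper (not from the original script): strip the markup
// first, then keep at most $limit words. The ellipsis is only added
// when something was actually cut off.
function preview_words(string $html, int $limit = 150): string
{
    // Remove the tags so markup is not counted as content.
    $text = trim(strip_tags($html));

    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    if (count($words) <= $limit) {
        return implode(' ', $words);
    }

    return implode(' ', array_slice($words, 0, $limit)) . '...';
}
```

In the loop from the question this would be used as $posts[$counter]['preview'] = preview_words((string) $post->{'regular-body'}, 150);. One caveat: strip_tags removes tags without inserting whitespace, so text from adjacent elements such as </p><p> can run together into a single word.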

2011-08-01 20:20:07

Ah yes.. completely forgot to strip the tags from the body. Thanks! – CodeCrack

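Looking at the <regular-body> tag, it appears to contain HTML. I would therefore suggest parsing it into a DOMDocument (http://www.php.net/manual/en/domdocument.loadhtml.php). You can then iterate over all the items and ignore certain tags (for example, ignore <img> but keep <p>). After that, you can render out the content you want and truncate it to 150 characters.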
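A minimal sketch of that idea, under a few assumptions: the helper name preview_from_html is made up, only <img> elements are dropped, and the result is reduced to plain text before truncating, so no tags are kept in the preview. Character encoding is not handled here; loadHTML needs extra care with non-ASCII input.

```php
<?php
// Hypothetical helper illustrating the DOMDocument suggestion: remove
// unwanted elements (here <img>), then truncate the remaining text.
function preview_from_html(string $html, int $limit = 150): string
{
    $doc = new DOMDocument();

    // Feed fragments are rarely valid documents, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrapping in a <div> guarantees non-empty input with a single root.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName returns a live list, so copy it before
    // removing nodes from the tree.
    foreach (iterator_to_array($doc->getElementsByTagName('img')) as $img) {
        $img->parentNode->removeChild($img);
    }

    // loadHTML adds <html><body> around the fragment; textContent of
    // <body> is the concatenated text of everything that is left.
    $body = $doc->getElementsByTagName('body')->item(0);
    $text = trim(preg_replace('/\s+/', ' ', $body->textContent));

    return strlen($text) > $limit ? substr($text, 0, $limit) . '...' : $text;
}
```

To skip other elements as well, the same removal loop could be repeated for each tag name you want to ignore.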

2011-08-01 20:20:14 afuzzyllama
