2011-01-09

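PHP: accurately splitting a string with tags into an array

The task is to split a string into chunks of 500 characters. I did this with str_split, but I ran into a problem. Of course, the string has to be split on word boundaries, otherwise the text is unreadable. And there is more. The text contains links, and if I split inside them the links break (any HTML, actually) =) So I need to start a split only where a tag has already ended or has not yet begun... The same applies to words. Being off by ±100 characters is not a problem.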

I would really appreciate a piece of code that does this. I am not very good with regular expressions.

Edit:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac diam non nisl interdum tempus. Nam id ipsum id nunc tempus varius. Suspendisse ut neque a velit elementum placerat. Curabitur lobortis, lorem sit <a href="#">amet tincidunt ultricies,</a> eros ante feugiat dui, sit amet lacinia metus risus a magna. Duis velit dui, sollicitudin at aliquet et, elementum at dui. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae;

Script:

<?php 

$str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. <a href=\"http://example.com\">Phasellus condimentum 
facilisis ipsum</a>, quis elementum urna ornare non. Cras nisi libero, dapibus sed euismod id, pharetra eu libero. 
Maecenas mi nulla, ultrices in congue in, viverra ac massa. Quisque <br/>at turpis nulla. Suspendisse semper urna eu 
augue aliquet dictum. Mauris at purus in lectus varius bibendum. <em>Fusce hendrerit <strong>posuere ante</strong></em>, 
at pellentesque odio lobortis at. Integer quis urna eget ipsum dictum volutpat quis et leo. Etiam hendrerit eleifend 
ornare. Phasellus eget justo elit."; 

$str = str_split($str, 200); 

var_dump($str); 

Output:

array(4) { 
    [0]=> 
    string(200) "Lorem ipsum dolor sit amet, consectetur adipiscing elit. <a href="http://example.com">Phasellus condimentum 
facilisis ipsum</a>, quis elementum urna ornare non. Cras nisi libero, dapibus sed euismod " 
    [1]=> 
    string(200) "id, pharetra eu libero. 
Maecenas mi nulla, ultrices in congue in, viverra ac massa. Quisque <br/>at turpis nulla. Suspendisse semper urna eu 
augue aliquet dictum. Mauris at purus in lectus varius bi" 
    [2]=> 
    string(200) "bendum. <em>Fusce hendrerit <strong>posuere ante</strong></em>, 
at pellentesque odio lobortis at. Integer quis urna eget ipsum dictum volutpat quis et leo. Etiam hendrerit eleifend 
ornare. Phasellus" 
    [3]=> 
    string(17) " eget justo elit." 
} 

This is a bad split in the middle of a word: half of the word ends up in $str[1]. If a link had been at that position, it would have been broken.

Have you tried explode(" ", $string)? – Aston 2011-01-09 11:02:57

I'd really appreciate some sample data :) – 2011-01-09 11:03:09

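Answers

It is better not to do this with regular expressions, but with PHP's native XML/HTML parsing functionality. Something like the code below might do what you want: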

<?php 

$str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. <a href=\"http://example.com\">Phasellus condimentum facilisis ipsum</a>, quis elementum urna ornare non. Cras nisi libero, dapibus sed euismod id, pharetra eu libero. Maecenas mi nulla, ultrices in congue in, viverra ac massa. Quisque <br/>at turpis nulla. Suspendisse semper urna eu augue aliquet dictum. Mauris at purus in lectus varius bibendum. <em>Fusce hendrerit <strong>posuere ante</strong></em>, at pellentesque odio lobortis at. Integer quis urna eget ipsum dictum volutpat quis et leo. Etiam hendrerit eleifend ornare. Phasellus eget justo elit."; 

$dom = new DOMDocument; 

$root = $dom->createDocumentFragment(); 
$root->appendXML($str); 

$bits = array(); 

// Walk the top-level nodes of the fragment.
foreach ($root->childNodes as $node) { 
    if ($node->nodeType == XML_TEXT_NODE) { 
        // Plain text: split it into words.
        $bits = array_merge($bits, explode(' ', $node->nodeValue)); 
    } elseif ($node->nodeType == XML_ELEMENT_NODE) { 
        // Element: serialize it whole so its markup is never split.
        $dom->appendChild($newnode = $node->cloneNode(true)); 
        $bits[] = $dom->saveHTML(); 
        $dom->removeChild($newnode); 
    } 
} 

var_dump($bits);
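
The code above only produces the individual pieces (words and whole top-level elements); it does not yet group them into chunks of roughly 500 characters. A minimal sketch of that last step might look like the following. The helper name chunk_bits and its $limit parameter are my own assumptions, not part of any PHP API, and the greedy packing is just one possible strategy.

```php
<?php

// Hypothetical helper: greedily packs word/tag pieces into chunks of
// at most $limit bytes, never cutting inside a piece.
function chunk_bits(array $bits, $limit = 500)
{
    $chunks  = array();
    $current = '';

    foreach ($bits as $bit) {
        // Drop surrounding whitespace (for example a trailing newline
        // from saveHTML()) and skip empty pieces left by explode().
        $bit = trim($bit);
        if ($bit === '') {
            continue;
        }

        $candidate = ($current === '') ? $bit : $current . ' ' . $bit;

        if ($current !== '' && strlen($candidate) > $limit) {
            // Adding this piece would exceed the limit: close the chunk.
            $chunks[] = $current;
            $current  = $bit;
        } else {
            $current = $candidate;
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

// Small example with a deliberately tiny limit.
$example = array('Lorem', 'ipsum', '<a href="#">dolor sit</a>', '', 'amet');
var_dump(chunk_bits($example, 12));
// Chunks: "Lorem ipsum", "<a href="#">dolor sit</a>", "amet"
```

With the real data you would call chunk_bits($bits, 500). Some caveats of this sketch: a single piece longer than the limit (such as a long link) becomes its own oversized chunk, which fits the stated ±100 character tolerance only if the elements are short; pieces are rejoined with a single space, so punctuation that directly followed a closing tag gets a space in front of it; and strlen counts bytes, so multibyte text would need mb_strlen instead.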