2013-10-15 102 views

Convert multiple newlines into paragraphs

I want to find the paragraphs in a string and format them. What I have mostly works, but not 100% of the time.

I have a string that looks like this:

##Chapter 1 

Once upon a time there was a little girl named sally, she went to school. 

One day it was awesome! 

##Chapter 2 

We all had a parade! 

I format the string by converting each ##... line into an <h2> heading, so it now looks like this:

<h2>Chapter 1</h2> 

Once upon a time there was a little girl named sally, she went to school. 

One day it was awesome! 

<h2>Chapter 2</h2> 

We all had a parade! 
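
The heading conversion itself is not shown in the question. A minimal sketch of how that step might look, assuming headings always start a line with exactly two hash characters (the function name `convertHeadings` is hypothetical):

```php
<?php
// Hypothetical helper, not from the question: turn "##Title" lines into <h2> headings.
function convertHeadings(string $text): string
{
    // The m modifier lets ^ and $ match at line boundaries;
    // [ \t]* absorbs optional spacing around the title.
    return preg_replace('/^##[ \t]*(.+?)[ \t]*$/m', '<h2>$1</h2>', $text);
}

$input = "##Chapter 1\n\nOnce upon a time.\n\n##Chapter 2\n\nWe all had a parade!";
echo convertHeadings($input);
```

Lines that do not start with ## are left untouched, so the blank lines between blocks survive for the paragraph step.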

Now I want to convert everything else into paragraphs, which I do like this:

// Converts sections to paragraphs: 
$this->string = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/", "<p>$2</p>", $this->string); 

// Remove the paragraph tags wrapped around header tags (h1 to h6): 
$this->string = preg_replace("/<p><h(\d)>(.+?)<\/h\d><\/p>/i", "<h$1>$2</h$1>", $this->string); 

This is the final output (newlines added for readability):

<h2>Chapter 1</h2> 
Once upon a time there was a little girl named sally, she went to school. 
<p>One day it was awesome!</p> 
<h2>Chapter 2</h2> 
<p>We all had a parade!</p> 

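As I said at the start, this doesn't work 100%: as you can see, the first paragraph was not wrapped in paragraph tags. What can I do to improve the regex?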
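
The skipped paragraph can be reproduced on a reduced input. The likely cause is that the trailing `(\n\n|$)` group consumes the blank line, so the next block no longer has a `\n\n` in front of it to match. The second pattern below is an assumption rather than something from the thread: it checks the same delimiters with lookarounds so they are not consumed.

```php
<?php
// Reduced input: a heading followed by two one-line paragraphs.
$text = "<h2>A</h2>\n\nfirst\n\nsecond";

// Pattern from the question: the blank line after the heading is consumed,
// so "first" has no leading \n\n left and is skipped.
$consuming = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/", "<p>$2</p>", $text);

// Assumed variant: lookbehind and lookahead test the delimiters without consuming them.
$lookaround = preg_replace('/(?<=^|\n\n)(.+?)(?=\n\n|$)/', '<p>$1</p>', $text);

echo $consuming, "\n---\n", $lookaround, "\n";
```

With the lookaround variant every block is wrapped and the blank lines are kept, so the header clean-up step from the question can still run afterwards. Because `.` does not match newlines here, a paragraph that spans several lines would still need extra handling.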

Answers

You can do it in one step:

$this->string = preg_replace('~(*BSR_ANYCRLF)\R\R\K(?>[^<\r\n]++|<(?!h[1-6]\b)|\R(?!\R))+(?=\R\R|$)~u', 
          '<p>$0</p>', $this->string); 

Pattern details

(*BSR_ANYCRLF)      # \R matches only CR, LF, or CRLF
\R\R                # two newlines
\K                  # reset the match start (the two newlines are kept out of the result)
(?>                 # open an atomic group
    [^<\r\n]++      # all characters except <, CR, LF
  |                 # OR
    <(?!h[1-6]\b)   # < not followed by a header tag
  |                 # OR
    \R(?!\R)        # single newline
)+                  # close the atomic group and repeat one or more times
(?=\R\R|$)          # followed by two newlines or the end of the string
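
To try the pattern in isolation, it can be wrapped in a small function and run on the sample text from the question. The function name `wrapParagraphs` and the shortened sample string are assumptions made for this sketch; the pattern is the one given above.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
function wrapParagraphs(string $html): string
{
    $pattern = '~(*BSR_ANYCRLF)\R\R\K(?>[^<\r\n]++|<(?!h[1-6]\b)|\R(?!\R))+(?=\R\R|$)~u';
    // $0 is the whole match, which starts after \K, so the blank lines stay in place.
    return preg_replace($pattern, '<p>$0</p>', $html);
}

$html = "<h2>Chapter 1</h2>\n\nOnce upon a time.\n\nOne day it was awesome!\n\n"
      . "<h2>Chapter 2</h2>\n\nWe all had a parade!";
echo wrapParagraphs($html);
```

Note that a match always starts after two newlines, so text at the very beginning of the string (with nothing before it) is not wrapped by this pattern. That is not a problem when the string starts with a heading, as it does here.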

Awesome, this works like a charm!

Add the m modifier to the first regex.

// Converts sections to paragraphs: 
$this->string = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/m", "<p>$2</p>", $this->string); 

// Remove the paragraph tags wrapped around header tags (h1 to h6): 
$this->string = preg_replace("/<p><h(\d)>(.+?)<\/h\d><\/p>/i", "<h$1>$2</h$1>", $this->string); 

I tried that and it doesn't work; it adds paragraph tags to everything.
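
One plausible reading of that comment: with the m modifier, `^` and `$` match at every line boundary, so each line becomes its own paragraph even when several lines belong together. A small sketch on a made-up input (the sample string is an assumption, not the asker's data):

```php
<?php
// One paragraph spanning two lines, followed by a second paragraph.
$text = "line one\nline two\n\nnext paragraph";

// Pattern from this answer, with the m modifier added.
$wrapped = preg_replace("/(^|\n\n)(.+?)(\n\n|$)/m", "<p>$2</p>", $text);

echo $wrapped;
```

Here "line one" and "line two" end up in separate `<p>` elements although only a single newline separates them, which matches the "paragraphs on everything" behavior described in the comment.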