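I am working with text in paragraph format, where a date always sits on the line above each article. The problem is that each article is followed by line breaks of different kinds, including Unicode line breaks I cannot identify. I need to remove every run of line breaks between articles and replace it with exactly two newlines (\n\n). This has to cover both regular and Unicode line breaks.
So from this: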
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
to this:
05/12
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
11/01
The 1959 Mexico hurricane was a devastating tropical cyclone
that was one of the worst ever Pacific hurricanes. It
impacted the Pacific coast of Mexico in October 1959. The
hurricane killed at least 1,000 people.
I tried using preg_replace(), but it does not match every instance:
$text = preg_replace('/\r?\n+(?=\d{2}\/\d{2})/', "\n\n", $text);
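Maybe you need to try matching all Unicode characters that represent a "line break"? I know of another one that messed up my text tokenizer a week ago: the carriage return '\r'. It's just a hint, though... **Scratch that, it looks like you are already matching '\r'.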
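
One way to act on that hint is sketched below, assuming the input is valid UTF-8 and that a date at the start of a line always looks like `dd/dd`. PCRE's `\R` escape matches any newline sequence, and with the `/u` modifier it also covers U+0085, U+2028 and U+2029, which `\r?\n+` misses. The function name `normalizeArticleBreaks` is made up for illustration and is not from the original post.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function normalizeArticleBreaks(string $text): string
{
    // \R+  : one or more newline sequences of any kind (\r\n, \n, \r, \v, \f,
    //        and, because of the /u modifier, U+0085, U+2028, U+2029).
    // (?=) : lookahead, so the date itself is not consumed by the match.
    $result = preg_replace('/\R+(?=\d{2}\/\d{2})/u', "\n\n", $text);

    // preg_replace() returns null on failure (for example, invalid UTF-8
    // input with /u); fall back to the unmodified text in that case.
    return $result ?? $text;
}
```

Calling `normalizeArticleBreaks($text)` leaves the line breaks inside each article untouched, because only runs that are directly followed by a date are replaced. If the break runs may also contain spaces or tabs between the newlines, a pattern such as `(?:\h*\R)+` in place of `\R+` might be worth trying.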