2013-06-20
0

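Regex - match and extract with complex conditions

I want to write a regular expression that matches these conditions:

  • At most 8000 characters (any character, including "\r\n")
  • At most 10 lines (separated by \r\n)
  • Extract only the first 4 lines from the matched text

I couldn't find a good way to do this... :/

Thanks!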

+0

There are ways to do this with a regular expression, but none of them are good ways. –

+0

Which language or tool are you using? Also, you want to extract the first 4 lines - what should happen if there are fewer than 4 lines? – ridgerunner

Answers

1

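A regular expression is not what you need. Regular expressions are meant for matching a pattern, not for checking a length. If you keep the data in a string, check myString.length <= 8000 (using the correct syntax for your language). For the line count, count the number of \r\n sequences in the string (this can be done iteratively). To get the first four lines, just find the 4th \r\n and use a substring method to take everything before it.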
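The steps in this answer can be sketched without any regular expression. The following is only an illustration in Python, since the question names no language; the function name first_four_lines and its parameters are hypothetical, and it assumes that a trailing \r\n ends the last line rather than starting an empty one.

```python
def first_four_lines(text, max_chars=8000, max_lines=10, keep=4):
    """Return the first `keep` lines of `text`, or None if a limit is exceeded."""
    # Length check: plain character count, line breaks included.
    if len(text) > max_chars:
        return None
    # Split on the separator named in the question.
    lines = text.split("\r\n")
    # Assumption: a trailing "\r\n" terminates the last line, it does not add one.
    if lines and lines[-1] == "":
        lines.pop()
    if len(lines) > max_lines:
        return None
    # Re-join the leading lines with the original separator.
    return "\r\n".join(lines[:keep])
```

If the input has fewer than four lines, this sketch simply returns all of them.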

+0

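-1 for incorrect information. A repetition quantifier in a regex such as `.{0,8000}` lets you match between 0 and 8000 characters. If you want to match an exact number of characters, you can use `.{8000}`. But don't take my word for it; you can read more at http://www.regular-expressions.info/repeat.html. –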
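A small sketch of the quantifiers mentioned in this comment, written for Python's re module as an assumption (the thread names no language). The names AT_MOST_8000, EXACTLY_8000 and fits are hypothetical. re.DOTALL is needed so that . also matches \n, and fullmatch makes the pattern cover the whole string.

```python
import re

# .{0,8000}: between 0 and 8000 characters; DOTALL lets . match "\n" as well.
AT_MOST_8000 = re.compile(r".{0,8000}", re.DOTALL)
# .{8000}: exactly 8000 characters.
EXACTLY_8000 = re.compile(r".{8000}", re.DOTALL)


def fits(text):
    """True when the whole string is at most 8000 characters long."""
    return AT_MOST_8000.fullmatch(text) is not None
```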

1

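Description

This expression does the following:

  • Validates that the input string is between zero and 8000 characters long
  • Validates that the text has at most 10 newline-delimited lines
  • Then captures the first 4 newline-delimited lines of the text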

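\A(?=.{0,8000}\z)(?=(?:[^\r\n]*(?:\r?\n|\r|\z)){0,10}\z)(?:[^\r\n]*(?:\r?\n|\r|\z)){0,4}

This requires the s option (dot matches all characters, including line breaks).

Expanded

  • \A anchors the match at the start of the string
  • (?=.{0,8000}\z) looks ahead and validates that the whole string is between zero and 8000 characters long; with the s option, . also matches \r and \n, and \z matches only at the very end of the string
  • (?=(?:[^\r\n]*(?:\r?\n|\r|\z)){0,10}\z) looks ahead and validates that there are no more than 10 newline-delimited lines; each line is written as [^\r\n]* so that it cannot run across a line break
  • (?:[^\r\n]*(?:\r?\n|\r|\z)){0,4} matches the first 4 lines of text, including their line terminators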
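The three parts described above (a length lookahead, a line-count lookahead, then a match of the first four lines) can be sketched in Python. This is a port under stated assumptions, not the answer's PHP: Python's re module is used, where \Z matches only at the absolute end of the string, and each line is written as [^\r\n]* rather than with a dot so that one "line" cannot stretch over a line break. The names LINE, FIRST_FOUR and extract_first_four are hypothetical.

```python
import re

# One line: any run of non-line-break characters, then a terminator or the end.
LINE = r"[^\r\n]*(?:\r?\n|\r|\Z)"

FIRST_FOUR = re.compile(
    r"\A(?=.{0,8000}\Z)"             # whole string is at most 8000 characters
    + r"(?=(?:" + LINE + r"){0,10}\Z)"  # whole string is at most 10 lines
    + r"(?:" + LINE + r"){0,4}",        # consume up to the first 4 lines
    re.DOTALL,                       # let . in the length check match "\n"
)


def extract_first_four(text):
    """Return the first four lines (with terminators), or None if invalid."""
    m = FIRST_FOUR.match(text)
    return m.group(0) if m else None
```

With fewer than four lines the whole input is returned, which is one possible answer to the question raised in the comments; an input that breaks either limit yields None.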

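PHP code example:

No language was specified, so I am including this PHP example to show how it works, along with sample output.

Input text

This test input is a string of 8 newline-delimited lines. It is only 1779 characters long.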

Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 
the Italic Mountains, she had a last view back on the skyline of her hometown Bookmarksgrove, the headline of Alphabet Village and the subline of her own road, the Line Lane. Pityful a rethoric question ran over her cheek, then 
she continued her way. On her way she met a copy. The copy warned the Little Blind Text, that where it came from it would have been rewritten a thousand times and everything that was left from its origin would be the word "and" 
and the Little Blind Text should turn around and return to its own, safe country. But nothing the copy said could convince her and so it didn’t take long until a few insidious Copy Writers ambushed her, made her drunk with Longe 
and Parole and dragged her into their agency, where they abused her for their projects again and again. And if she hasn’t been rewritten, then they are still using her. 

Code

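<?php
// Replace with the text to validate, for example the input text above.
$sourcestring = "your source string";
// s: let . match line breaks; \z anchors at the very end of the string.
$pattern = '/\A(?=.{0,8000}\z)(?=(?:[^\r\n]*(?:\r?\n|\r|\z)){0,10}\z)(?:[^\r\n]*(?:\r?\n|\r|\z)){0,4}/s';
preg_match($pattern, $sourcestring, $matches);
// $matches[0] holds the first four lines when the input passes both checks.
echo "<pre>" . print_r($matches, true);
?>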

Matches

$matches Array: 
(
    [0] => Far far away, behind the word mountains, far from the countries Vokalia and Consonantia, there live the blind texts. Separated they live in Bookmarksgrove right at the coast of the Semantics, a large language ocean. A small 
river named Duden flows by their place and supplies it with the necessary regelialia. It is a paradisematic country, in which roasted parts of sentences fly into your mouth. Even the all-powerful Pointing has no control about 
the blind texts it is an almost unorthographic life One day however a small line of blind text by the name of Lorem Ipsum decided to leave for the far World of Grammar. The Big Oxmox advised her not to do so, because there were 
thousands of bad Commas, wild Question Marks and devious Semikoli, but the Little Blind Text didn’t listen. She packed her seven versalia, put her initial into the belt and made herself on the way. When she reached the first hills of 

)