2017-07-23 55 views
0

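Matching dashes of various lengths

I have a block of text made up of several paragraphs separated by dashed lines of varying length. I want to use Python to match the boundaries between the paragraphs. My requirements are as follows: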

  1. Match lines that consist only of dashes, whatever their length
  2. Exclude lines that contain dashes together with any other character(s)

Here is a sample block of text:

Believing neglected so so allowance existence departure in. 
In design active temper be uneasy. Thirty for remove plenty 
regard you summer though. He preference connection astonished 
on of yet. ------ Partiality on or continuing in particular principles as. 
Do believing oh disposing to supported allowance we. 
------- 
Admiration we surrounded possession frequently he. 
Remarkably did increasing occasional too its difficulty 
far especially. Known tiled but sorry joy balls. Bed sudden 

manner indeed fat now feebly. Face do with in need of 
wife paid that be. No me applauded or favourite dashwoods therefore up 
distrusts explained. 
----t-- 
------ 
And produce say the ten moments parties. Simple innate summer 
fat appear basket his desire joy. Outward clothes promise at gravity 
do excited. 
Sufficient particular impossible by reasonable oh expression is. Yet 
preference 
connection unpleasant yet melancholy but end appearance. And 
excellence partiality 
estimating terminated day everything. 
---------  

I have tried the following:

r"-*.-"g or (.*?)-+ 

However, these match every line that contains two or more dashes, including lines that also contain other characters.

+0

You can match something of a specific length with 'CHAR{MINLENGTH,MAXLENGTH}' or 'CHAR{LENGTH}' – Luke

+1

You can always use '(?m)^-+[^\S\r\n]*$' – sln

Answers

1

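Simply r"^[-]+$" should work. Just remember to specify MULTILINE mode so that ^ and $ match the start and end of each line respectively, rather than only the start and end of the whole string.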
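A minimal sketch of the difference MULTILINE makes. The sample string and variable names here are made up for illustration and are not the asker's actual text:

```python
import re

# Hypothetical sample: two dash-only separator lines, one line with dashes
# embedded in prose, and one line where a letter interrupts the dashes.
text = (
    "first paragraph\n"
    "------\n"
    "second ----t-- paragraph\n"
    "----t--\n"
    "---------\n"
    "third paragraph"
)

# Without MULTILINE, ^ and $ only anchor at the ends of the whole string,
# so no dash-only line is found.
without_flag = re.findall(r"^[-]+$", text)

# With MULTILINE, ^ and $ anchor at the start and end of every line.
separators = re.findall(r"^[-]+$", text, flags=re.MULTILINE)

print(without_flag)  # []
print(separators)    # ['------', '---------']
```

The line "----t--" is skipped because the t sits between the dashes and the end of the line, so the anchors cannot both be satisfied.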

Actually, the last line does not match, because it has trailing spaces. If spaces are allowed after the dashes, you can use r"^[-]+[ ]*$"
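To illustrate the trailing-space case, here is a small sketch comparing the two patterns. The sample string is an assumption made up for this example; its last separator is followed by two spaces:

```python
import re

# Hypothetical sample: the second separator line ends with two spaces.
text = "one\n------\ntwo\n---------  \nthree"

# Strict pattern: nothing may follow the dashes on the line.
strict = re.findall(r"^[-]+$", text, flags=re.MULTILINE)

# Lenient pattern: any number of spaces may follow the dashes.
lenient = re.findall(r"^[-]+[ ]*$", text, flags=re.MULTILINE)

print(strict)   # ['------']
print(lenient)  # ['------', '---------  ']
```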

Another thing - if you also want to match only the boundaries between paragraphs, and not the one at the very end, you can use r"^[-]+[ ]*$[^\Z]"
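One caveat: inside a character class, \Z is not an end-of-string anchor, so [^\Z] does not actually test for the end of the text, and newer Python versions may reject it as a bad escape. As a hedged alternative (not the pattern from this answer), a lookahead that requires some non-whitespace text after the separator expresses the same intent. The sample string and names below are assumptions for illustration:

```python
import re

# Alternative sketch: the lookahead (?=\s*\S) only succeeds when more
# non-whitespace text follows, so a trailing separator is skipped.
BETWEEN_PARAGRAPHS = re.compile(r"^-+[ ]*$(?=\s*\S)", re.MULTILINE)

# Hypothetical sample ending with a separator and a final newline.
text = "one\n------\ntwo\n-------\nthree\n---------  \n"

inner_separators = BETWEEN_PARAGRAPHS.findall(text)
print(inner_separators)  # ['------', '-------']
```

Because the lookahead consumes nothing, each match is still just the separator line itself.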

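Edit: taking from @sln's comment, here are a few nuances I forgot:

  1. You can set the MULTILINE flag by using (?m) at the start of the pattern.
  2. The character class [^\S\r\n] matches all whitespace except newlines. You can use it instead of [ ], which matches only spaces.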
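Putting both points together, a sketch along these lines could be used to find the separators and split the text into paragraphs. The sample string and the names SEPARATOR, found and paragraphs are hypothetical:

```python
import re

# (?m) turns on MULTILINE inline; [^\S\r\n]* allows trailing spaces or
# tabs (any whitespace except line breaks) after the dashes.
SEPARATOR = re.compile(r"(?m)^-+[^\S\r\n]*$")

# Hypothetical sample: the second separator is followed by a tab and a space.
text = "alpha\n------\nbeta ----t-- beta\n----t--\n-------\t \ngamma"

found = SEPARATOR.findall(text)

# Split on the separator lines and drop surrounding blank space.
paragraphs = [p.strip() for p in SEPARATOR.split(text) if p.strip()]

print(found)       # ['------', '-------\t ']
print(paragraphs)  # ['alpha', 'beta ----t-- beta\n----t--', 'gamma']
```

Lines such as "----t--" stay inside their paragraph because they are not made up of dashes alone.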