2016-06-15 34 views
0

Split a line wherever an uppercase letter is found

I need a regular expression that splits a line whenever it encounters uppercase letters.

Example:

line1 = JOHN levin have fun RAJESH is a good person SAM was ok 

Expected output:

line1 = JOHN levin have fun 
RAJESH is a good person 
SAM was ok. 
+2

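[What have you tried so far?](http://whathaveyoutried.com) Please [edit] your question to show a [mcve] of the code you are having problems with, then we can try to help with the specific problem. You should also read [ask]. –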

+3

Why is there no newline between 'line1 =' and 'John'? – anishsane

+0

Try `echo $line | grep -Eo '[A-Z]+[^A-Z]+'`. – blackSmith

Answers

0

Is this what you want?

$ line1='JOHN levin have fun RAJESH is a good person SAM was ok' 
$ sed 's/[A-Z]\+/\n&/g' <<< $line1 

JOHN levin have fun 
RAJESH is a good person 
SAM was ok 

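Note that a newline is added before JOHN, because that is what your requirement asks for. Avoiding it is a separate problem. Your requirement was:

I need a regular expression that splits a line if an uppercase letter is found.

Taken literally, the expected output should therefore be: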

$ sed 's/\([A-Z]\)/\n\1/g' <<< $line1 

J 
O 
H 
N levin have fun 
R 
A 
J 
E 
S 
H is a good person 
S 
A 
M was ok 
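
The blank first line in the first command's output comes from the newline inserted before JOHN. A minimal sketch of one way to drop it, assuming GNU sed (for `\+` and `\n`); the second substitution is an addition for illustration and is not part of the answer above:

```shell
line1='JOHN levin have fun RAJESH is a good person SAM was ok'

# Insert a newline before every run of uppercase letters, then delete the
# newline that ends up at the very start of the pattern space (GNU sed).
result=$(printf '%s\n' "$line1" | sed 's/[A-Z]\+/\n&/g; s/^\n//')

printf '%s\n' "$result"
```

Because `s/^\n//` only matches at the start, the later line breaks are left alone.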
+0

The expected output should be what the OP stated... – 123

+0

@123 Good point – nowox

0

Try the following:

echo "<your string>" | awk '{once_found = 0; for (i = 1; i <= NF; i++) {if ($i ~ /[A-Z]/) {if (once_found) {print "";} once_found++;} printf("%s ", $i);} print "";}'

I added once_found to omit the newline between line1 = and JOHN. I'm not sure whether you actually want that. If not, just remove once_found and everything tied to it.
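
For comparison, here is a sketch of the variant without once_found, laid out over several lines. The sample string is the one from the question; `LC_ALL=C` is an assumption added so that `[A-Z]` means ASCII uppercase regardless of locale.

```shell
line='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'

result=$(printf '%s\n' "$line" | LC_ALL=C awk '{
    for (i = 1; i <= NF; i++) {
        # Start a new output line before every field containing an uppercase letter.
        if ($i ~ /[A-Z]/) print ""
        printf("%s ", $i)
    }
    print ""
}')

printf '%s\n' "$result"
```

Without the flag, the break also lands before JOHN, so `line1 = ` ends up on a line of its own.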

1

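This command splits the line before each uppercase letter that is preceded by whitespace, starting from the second occurrence (as in the example):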

sed 's/\(\s\)\([A-Z]\)/\1\n\2/g; s/\n//' 

Example:

$ echo 'line1 = JOHN levin have fun RAJESH is a good person SAM was ok'|sed 's/\(\s\)\([A-Z]\)/\1\n\2/g; s/\n//' 
line1 = JOHN levin have fun 
RAJESH is a good person 
SAM was ok 
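
To see why the command has two parts, it may help to run the first substitution on its own. This is a small sketch assuming GNU sed (`\s` and `\n` are GNU extensions); the variable names are only for illustration.

```shell
line='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'

# Step 1 alone: a newline is inserted after every whitespace character that
# precedes an uppercase letter, including the one before JOHN.
step1=$(printf '%s\n' "$line" | sed 's/\(\s\)\([A-Z]\)/\1\n\2/g')

# Both steps: s/\n// has no g flag, so it deletes only the first newline,
# which rejoins "line1 = " and "JOHN".
both=$(printf '%s\n' "$line" | sed 's/\(\s\)\([A-Z]\)/\1\n\2/g; s/\n//')

printf 'step 1 only:\n%s\n\nboth steps:\n%s\n' "$step1" "$both"
```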
0

Another approach, based on gawk:

$ a='line1 = JOHN levin have fun RAJESH is a good person SAM was ok' 

$ awk '{ORS=((NR==1)?"":"\n")RT}1' RS='[A-Z]+' <<< "$a" 
line1 = JOHN levin have fun 
RAJESH is a good person 
SAM was ok 
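  1. Split the input into records with RS=[A-Z]+
  2. For the first record, use ORS=RT; for every other record, use ORS="\n"RT
  3. Print

Note that sed is the right tool for what you are trying to do. This answer is only for illustration. If you need a more complex algorithm, you can use awk in this way.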
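
The role of RT is easier to follow when it is printed per record. This is a sketch that assumes gawk is installed (RT is a gawk extension, so the snippet checks for it first); the output format is made up for illustration.

```shell
a='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'
records=''

if command -v gawk >/dev/null 2>&1; then
    # RT holds the text that matched RS at the end of the current record.
    records=$(printf '%s\n' "$a" | LC_ALL=C gawk -v RS='[A-Z]+' \
        '{ printf "record %d: RT=[%s]\n", NR, RT }')
    printf '%s\n' "$records"
else
    echo 'gawk not found; RT is a gawk extension' >&2
fi
```

The last record has an empty RT because no uppercase run follows it, which is why the original command still ends its output with a plain newline.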

0

Using grep with -E for extended regular expressions and -o, which prints only the matched parts:

$ line="JOHN levin have fun RAJESH is a good person SAM was ok" 
$ grep -oE '[A-Z]+[^A-Z]+?' <<< "$line" 
JOHN levin have fun 
RAJESH is a good person 
SAM was ok 
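
One caveat with the grep approach is that `-o` prints only the matches, so any text before the first uppercase run is discarded. The sketch below uses the question's full sample line to show this. It writes the pattern with `*` instead of `+?`, since stacked repetition operators are not well defined in POSIX extended regular expressions; that substitution is an assumption for illustration, not the answer's exact pattern.

```shell
line='line1 = JOHN levin have fun RAJESH is a good person SAM was ok'

# Each match is an uppercase run plus everything up to the next uppercase letter.
# The leading "line1 = " matches nothing, so -o does not print it.
matches=$(printf '%s\n' "$line" | LC_ALL=C grep -oE '[A-Z]+[^A-Z]*')

printf '%s\n' "$matches"
```

If the `line1 = ` prefix has to be kept, the sed commands shown in the other answers are a better fit.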