Regex: repeated word

I need help with sed (sed only, please): if some word appears 3 times in a line, print that line.

Let's say this is the file:

abc abc gh abc 
abcabc abc 
ab ab cd ab xx ab 
ababab cc ababab 
abab abab cd abab

So the output should be:

abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

This is my attempt:

sed -n '/\([^ ]\+\)[ ]+\1\1\1/p' $1

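It doesn't work... :/ What am I doing wrong?

It doesn't matter whether the word is at the start of the line or not, and the occurrences don't need to be consecutive.

Source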

2015-02-05 nick shmick

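It looks like you have a lot of homework... You already asked [How to compare the first word and the last word of a line using sed?](http://stackoverflow.com/q/28318579/1983854). Did you use Avinash's answer to come up with a better attempt? – fedorqui 2015-02-05 14:15:48

I don't understand what you're asking, @fedorqui – 2015-02-05 14:17:13

Also, the repeated word doesn't need to be the first word on the line? – anubhava 2015-02-05 14:17:15

You need to add .* between the \1 back-references, so that other text may appear between the repeated words. The \+ and \b used here are GNU sed extensions.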

$ sed -n '/\b\([^ ]\+\)\b.*\b\1\b.*\b\1\b/p' file 
abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

I assume your input contains only spaces and word characters.

Source
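To make the difference concrete, here is a small sketch that runs both the original attempt and the corrected pattern over the sample lines from the question. It assumes GNU sed (for \+ and \b); the variable names input, attempt and fixed are only illustrative and not part of the original answer.

```shell
# Sample lines from the question, kept in a variable instead of a file.
input=$(printf '%s\n' \
  'abc abc gh abc ' \
  'abcabc abc ' \
  'ab ab cd ab xx ab ' \
  'ababab cc ababab ' \
  'abab abab cd abab')

# Original attempt: in a basic regex "[ ]+" means a space followed by a
# literal "+", and \1\1\1 demands three more back-to-back copies of the
# word, so none of the sample lines match.
attempt=$(printf '%s\n' "$input" | sed -n '/\([^ ]\+\)[ ]+\1\1\1/p')

# Corrected pattern: the same whole word three times, with anything
# (.*) allowed between the occurrences.
fixed=$(printf '%s\n' "$input" | sed -n '/\b\([^ ]\+\)\b.*\b\1\b.*\b\1\b/p')

printf 'attempt matched: [%s]\n' "$attempt"
printf '%s\n' "$fixed"
```

The first printf shows empty brackets because the attempt matches nothing; the second prints the three expected lines.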

2015-02-05 14:15:22

awesome avinash thanks bro :) @Avinash Raj – 2015-02-05 15:11:36

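I don't really understand the \b syntax... my teacher didn't explain it. It looks like it makes things shorter; can you explain? – 2015-02-05 15:14:58

'\b' matches the boundary between a word character and a non-word character (or the start or end of the line next to a word character). Word characters are 'A-Z', 'a-z', '0-9' and '_'. Any character other than these is called a non-word character. – 2015-02-05 15:23:24

I know the question asks for sed, but every system I have seen with sed also has awk, so here is an awk solution: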
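As a quick illustration of that comment, the sketch below replaces "abc" with and without \b. It assumes GNU sed, and the variable names are made up for the example.

```shell
# With \b on both sides, only the standalone word "abc" is replaced;
# the two copies inside "abcabc" have a word character next to them.
with_boundary=$(printf '%s\n' 'abcabc abc' | sed 's/\babc\b/X/g')

# Without \b, every occurrence of "abc" is replaced.
without_boundary=$(printf '%s\n' 'abcabc abc' | sed 's/abc/X/g')

printf '%s\n' "$with_boundary" "$without_boundary"
```

This prints "abcabc X" followed by "XX X", which is why the answer wraps each \1 in \b: without the boundaries, "abcabc abc" would count as three occurrences of "abc".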

awk -F"[^[:alnum:]]" '{delete a;for (i=1;i<=NF;i++) a[$i]++;for (i in a) if (a[i]>2) {print $0;next}}' file 
abc abc gh abc 
ab ab cd ab xx ab 
abab abab cd abab

This may be easier to understand than the regex solution. Here is the same program with comments:

awk -F"[^[:alnum:]]" '   # Field separator: any single non-alphanumeric character
{
    delete a                 # Clear array "a" at the start of every line
    for (i=1;i<=NF;i++)      # Loop through the words one by one
        a[$i]++              # Count the occurrences of each word in array "a"
    for (i in a)             # Loop through the array "a"
        if (a[i]>2) {        # If a word is found more than two times:
            print $0         # print the line
            next             # and skip to the next line, so it is not printed twice if another word also occurs three times
        }
}' file                      # Read the file
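One caveat with the program above: because every non-alphanumeric character is a separator, two adjacent separators (for example a double space) produce an empty field, and three empty fields on a line would count as a word repeated three times. The following is a hedged variant that skips empty fields. The function name lines_with_triple_word is hypothetical, and the sketch assumes an awk that supports "delete a" on a whole array and POSIX character classes (gawk, or a recent mawk).

```shell
# Hypothetical wrapper: prints lines in which some word occurs 3+ times.
# Reads the files given as arguments, or standard input if there are none.
lines_with_triple_word() {
  awk -F'[^[:alnum:]]' '{
      delete a                        # reset the counts for this line
      for (i = 1; i <= NF; i++)
          if ($i != "") a[$i]++       # count only non-empty fields
      for (w in a)
          if (a[w] > 2) { print; next }   # print once, then next line
  }' "$@"
}

# Usage: the first line has three empty fields but no repeated word.
printf '%s\n' 'a  b  c  d' 'x y x z x' | lines_with_triple_word
```

With this input only "x y x z x" is printed, whereas the version without the empty-field check would also print the first line.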

Source

2015-02-05 16:38:37 Jotne