Find words that occur only once in a string

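How can I find the words in a string that have no duplicates in bash? I'd like to know whether there is a "native" bash way of doing this, or whether I need another command-line utility (such as awk, sed, grep, ...). For example, var1="thrice once twice twice thrice";. I need something that will split out the word 'once', because it occurs only once (i.e., it has no duplicates).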

2014-02-08 BenjiWiebe

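I'd say no, there is no simple and elegant way. (Though I'm half prepared to be proven wrong; this site is great.) – tripleee

Define "split out". –

@KarolyHorvath 'var1' itself won't be needed afterwards. I just need to get hold of the unique words somehow, so I can use them in the rest of the script. – BenjiWiebe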

You can split the string on spaces and then use sort and uniq:

tr ' ' '\n' <<< "$var1" | sort | uniq -u

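For your input, this produces 'once'.

(If the input contains punctuation, you may want to strip it before doing anything else, to avoid unexpected results.)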
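A minimal sketch of the punctuation caveat, assuming the words are separated by whitespace and that all punctuation can simply be dropped. The function name unique_words is hypothetical and not part of the answer above. Note that [:punct:] also matches characters such as '_', '-' and '.', so this stripping step would be wrong for a list of usernames that may contain them.

```shell
#!/usr/bin/env bash
# unique_words is a hypothetical helper name used only for illustration.
# It prints every word that occurs exactly once in its first argument.
unique_words() {
    # tr -d '[:punct:]'        : delete punctuation characters
    # tr -s '[:space:]' '\n'   : turn each run of whitespace into one newline
    # sed '/^$/d'              : drop the empty line left by leading whitespace
    # sort | uniq -u           : keep only lines that occur exactly once
    tr -d '[:punct:]' <<< "$1" \
        | tr -s '[:space:]' '\n' \
        | sed '/^$/d' \
        | sort \
        | uniq -u
}

var1="thrice once, twice twice thrice."

# Capture the result in an array for use later in a script
# (mapfile requires bash 4 or later).
mapfile -t once_words < <(unique_words "$var1")
printf '%s\n' "${once_words[@]}"
```

With the sample input this should print 'once'; the array form is one possible way to reuse the words in the rest of a script.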

2014-02-08 18:43:06 devnull

That's exactly what I came up with :) –

This will be used with a list of usernames, so there won't be any punctuation. – BenjiWiebe

Works perfectly! Thanks! – BenjiWiebe

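@devnull's answer is the better choice (both for simplicity and probably for performance), but if you're looking for a bash-only solution:

Caveats:

Uses associative arrays, which are only available in bash 4 or higher.
A literal * in the input word list will not work (but other glob-like strings are fine).
Correctly handles multi-line input and input with multiple whitespace characters between words.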

# Define the input word list. 
# Bonus: multi-line input with multiple inter-word spaces. 
var1=$'thrice once  twice twice   thrice\ntwice again' 

# Declare associative array. 
declare -A wordCounts 

# Read all words and count the occurrence of each. 
while read -r w; do 
    [[ -n $w ]] && ((wordCounts[$w]+=1)) 
done <<<"${var1// /$'\n'}" # split input list into lines for easy parsing 

# Output result. 
# Note that the output list will NOT automatically be sorted, because the keys of an 
# associative array are not 'naturally sorted'; hence piping to `sort`. 
echo "Words that only occur once in '$var1':" 
echo "---" 
for w in "${!wordCounts[@]}"; do 
    ((wordCounts[$w] == 1)) && echo "$w" 
done | sort 

# Expected output: 
# again 
# once
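
The bash 4 requirement and the whitespace handling can also be sketched as a reusable function. This is only an illustrative variant under stated assumptions, not the answer's own code: the name once_only is hypothetical, it splits lines with read -ra instead of replacing spaces with newlines, and it makes no attempt to lift the limitation on a literal * in the input.

```shell
#!/usr/bin/env bash
# Associative arrays need bash 4 or later; fail early on older shells.
if (( BASH_VERSINFO[0] < 4 )); then
    echo "bash 4 or later is required" >&2
    exit 1
fi

# once_only is a hypothetical helper name used only for illustration.
# It prints, in sorted order, the words that occur exactly once in $1.
once_only() {
    local -A counts=()
    local -a words
    local w
    # read -ra splits each input line on whitespace (any run of spaces
    # or tabs) and performs no glob expansion on the resulting words.
    while read -ra words; do
        for w in "${words[@]}"; do
            counts[$w]=$(( ${counts[$w]:-0} + 1 ))
        done
    done <<< "$1"
    # Keys of an associative array come back in no particular order,
    # so the result is piped through sort.
    for w in "${!counts[@]}"; do
        [[ ${counts[$w]} -eq 1 ]] && printf '%s\n' "$w"
    done | sort
}

var1=$'thrice once  twice twice thrice\ntwice again'
once_only "$var1"
# Prints:
# again
# once
```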

2014-02-08 19:03:42 mklement0

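Interesting. Still, it's not exactly what I'd call *elegant*... – BenjiWiebe

Agreed - stick with @devnull's solution, and treat this one as a demonstration of bash's associative arrays. – mklement0

Just for fun, awk: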

awk '{ 
    for (i=1; i<=NF; i++) c[$i]++ 
    for (word in c) if (c[word]==1) print word 
}' <<< "$var1"

once

2014-02-08 21:50:44

+1; easily generalized to handle _multi-line_ input: `awk '{ for (i=1; i<=NF; i++) c[$i]++ } END { for (word in c) if (c[word]==1) print word }' <<< "$var1"`. One caveat: the output word list will not be sorted. – mklement0
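
The multi-line generalization from the comment above can be sketched as follows. The single-block awk version reports after every input line, so it can print words that turn out to be repeated on a later line; moving the report into an END block defers it until all lines have been counted. The pipe to sort and the variable name result are additions for illustration, assuming sorted output is wanted.

```shell
#!/usr/bin/env bash
var1=$'thrice once twice twice thrice\ntwice again'

# Count every field on every line; report only in END so the counts
# cover the whole input rather than one line at a time.
# awk's for-in loop visits keys in an unspecified order, hence sort.
result=$(awk '
    { for (i = 1; i <= NF; i++) c[$i]++ }
    END { for (word in c) if (c[word] == 1) print word }
' <<< "$var1" | sort)

printf '%s\n' "$result"
# Prints:
# again
# once
```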
