How can I find the common characters between two strings in bash?

For example:

s1="my_foo" 
s2="not_my_bar"

The expected result would be my_o. How can I do this in bash?


2011-08-04 johannes

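Would the underscore be a delimiter? – ajreal

No, the thing is that I want to get all the common characters from s1 and s2 – johannes

A huge gap between the simplicity of the task and the complexity of the shell script solutions. Very nice! –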

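My solution below uses fold to break each string into one character per line, sort to sort the list, comm to compare the two strings, and finally tr to delete the newline characters: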

comm -12 <(fold -w1 <<< "$s1" | sort -u) <(fold -w1 <<< "$s2" | sort -u) | tr -d '\n'
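To make the pipeline above easier to try out, here is a minimal sketch that wraps it in a function. The function name `common_chars_sorted` and the `LC_ALL=C` setting are assumptions added for illustration (they are not part of the original answer); the locale is pinned so that sort and comm agree on the ordering. Because the characters pass through sort -u, the result comes back sorted and without duplicates, not in the order of the first string.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the fold | sort | comm | tr pipeline.
common_chars_sorted() {
    # fold -w1 : one character per line
    # sort -u  : sorted, unique list (comm requires sorted input)
    # comm -12 : keep only lines present in both lists
    # tr -d    : join the surviving characters back into one string
    LC_ALL=C comm -12 \
        <(fold -w1 <<< "$1" | LC_ALL=C sort -u) \
        <(fold -w1 <<< "$2" | LC_ALL=C sort -u) |
        tr -d '\n'
    echo    # terminate the output with a newline
}

common_chars_sorted "my_foo" "not_my_bar"   # prints: _moy
```

For the strings in the question this gives _moy, which contains the same characters as the expected my_o but in sorted order.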

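And here is a pure Bash solution (it also preserves the order of the characters). It iterates over the first string and checks whether each character appears in the second string.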

s="temp_foo_bar"
t="temp_bar"
i=0
while [ $i -ne ${#s} ]
do
    c=${s:$i:1}
    if [[ $result != *$c* && $t == *$c* ]]
    then
        result=$result$c
    fi
    ((i++))
done
echo $result

Prints: temp_bar
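The same loop can be sketched as a function with every expansion quoted, which is one possible way to make it cope with spaces and with glob characters such as * or ? in either string. The function name `common_chars_ordered` is hypothetical; the algorithm is the one described above.

```shell
#!/usr/bin/env bash
# Hypothetical function form of the pure Bash loop, with quoting added.
common_chars_ordered() {
    local s=$1 t=$2 result= c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        # Quoting "$c" makes [[ ]] match it literally instead of as a pattern.
        # Keep c only if it is not already in result and it occurs in t.
        if [[ $result != *"$c"* && $t == *"$c"* ]]; then
            result+=$c
        fi
    done
    # Quoted output, so spaces in the result are preserved.
    printf '%s\n' "$result"
}

common_chars_ordered "my_foo" "not_my_bar"      # prints: my_o
common_chars_ordered "temp_foo_bar" "temp_bar"  # prints: temp_bar
```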


2011-08-04 13:18:15 dogbane

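Yes, I added -u to the sort command as well. –

Nice use of wildcards: +1. – jfg956

The downside of your second approach is that it cannot handle spaces in 't' and 's', at least in its current form. It is also rather long. –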

This should be a portable solution:

s1="my_foo"
s2="my_bar"
while [ -n "$s1" -a -n "$s2" ]
do
    if [ "${s1:0:1}" = "${s2:0:1}" ]
    then
        printf %s "${s1:0:1}"
    else
        break
    fi
    s1="${s1:1:${#s1}}"
    s2="${s2:1:${#s2}}"
done
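Since the loop above stops at the first mismatch, what it prints is the common prefix of the two strings. A minimal sketch of that behaviour as a function (the name `common_prefix` is hypothetical, and the substring expansions assume Bash):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the longest common prefix of two strings.
common_prefix() {
    local a=$1 b=$2 i=0
    # Advance while both strings still have a character at index i
    # and those characters are equal.
    while (( i < ${#a} && i < ${#b} )) && [[ ${a:i:1} == "${b:i:1}" ]]; do
        i=$((i + 1))
    done
    printf '%s\n' "${a:0:i}"
}

common_prefix "my_foo" "my_bar"       # prints: my_
common_prefix "my_foo_bar" "my_bar"   # prints: my_
```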


2011-08-04 13:20:36 l0b0

This only matches characters at the same index in both strings. So if you have 'my_foo_bar' and 'my_bar', it will not work. – dogbane

Assuming the strings do not contain embedded newlines:

s1='my_foo' s2='my_bar'
intersect=$(
    comm -12 <(
        fold -w1 <<< "$s1" |
            sort -u
    ) <(
        fold -w1 <<< "$s2" |
            sort -u
    ) |
        tr -d \\n
)

printf '%s\n' "$intersect"

And another one:

tr -dc "$s2" <<< "$s1"
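tr -dc deletes every character of the first string that is not in the second, so it keeps the original order and any repeated characters. The sketch below adds one possible way to drop the repeats while keeping the order; the function name `common_chars_tr` and the fold/awk stage are assumptions added for illustration, not part of the original answer. Note that tr interprets characters such as `-`, `\` and `[` in its set argument specially, so this sketch assumes the second string contains none of them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: tr -dc plus an order-preserving duplicate filter.
common_chars_tr() {
    printf '%s' "$1" |
        tr -dc "$2" |            # keep only characters that occur in $2
        fold -w1 |               # one character per line
        awk '!seen[$0]++' |      # print each character the first time only
        tr -d '\n'               # join back into a single string
    echo                         # terminate the output with a newline
}

common_chars_tr "my_foo" "not_my_bar"   # prints: my_o
```

Without the fold/awk stage, the same input yields my_oo.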


2011-08-04 13:22:06

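Your second solution using 'tr' is nice, but it does not remove duplicates. – dogbane

@dogbane, good point! I should have mentioned that. To remove duplicates, both values should be passed through the 'fold .. | sort ..' filter. –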

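comm=""
for ((i=0;i<${#s1};i++))
do
    # Quote both operands so that test still works when a character is a
    # space or when s2 is shorter than s1 (empty substring).
    if test "${s1:$i:1}" = "${s2:$i:1}"
    then
        comm=${comm}${s1:$i:1}
    fi
done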


2011-08-04 13:25:54 ajreal

A solution using a single sed invocation:

echo -e "$s1\n$s2" | sed -e 'N;s/^/\n/;:begin;s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/;t begin;s/\n.\(.*\)\n\(.*\)/\n\1\n\2/;t begin;s/\n\n.*//'

Like all cryptic sed scripts, it needs an explanation, given here in the form of a sed script file that can be run with echo -e "$s1\n$s2" | sed -f script:

# Read the next line so s1 and s2 are in the pattern space only separated by a \n. 
N 
# Put a \n at the beginning of the pattern space. 
s/^/\n/ 
# During the script execution, the pattern space will contain <result so far>\n<what left of s1>\n<what left of s2>. 
:begin 
# If the 1st char of s1 is found in s2, remove it from s1 and s2, append it to the result and do this again until it fails. 
s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/ 
t begin 
# When previous substitution fails, remove 1st char of s1 and try again to find 1st char of S1 in s2. 
s/\n.\(.*\)\n\(.*\)/\n\1\n\2/ 
t begin 
# When previous substitution fails, s1 is empty so remove the \n and what is left of s2. 
s/\n\n.*//

If you want to remove duplicates, add the following at the end of the script:

:end;s/\(.\)\(.*\)\1/\1\2/;t end
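Putting the one-liner and the duplicate-removal step together, a minimal sketch might look like this. The function name `common_chars_sed` is hypothetical, and printf is swapped in for echo -e so that backslashes in the input are not interpreted. The script itself is the one shown above and assumes GNU sed (it relies on `\n` in the regex and replacement, and on `;` after labels).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed script above; assumes GNU sed.
common_chars_sed() {
    # Feed the two strings to sed as two lines.
    printf '%s\n%s\n' "$1" "$2" |
        sed -e 'N;s/^/\n/;:begin;s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/;t begin;s/\n.\(.*\)\n\(.*\)/\n\1\n\2/;t begin;s/\n\n.*//;:end;s/\(.\)\(.*\)\1/\1\2/;t end'
}

common_chars_sed "my_foo" "not_my_bar"   # prints: my_o
```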

Edit: I realized that dogbane's pure shell solution uses the same algorithm and is probably more efficient.


2011-08-04 14:21:41 jfg956

Late entry, I just found this page:

echo "$str2" | 
    awk 'BEGIN{FS=""} 
    { n=0; while(n<=NF) { 
    if ($n == substr(test,n,1)) { if(!found[$n]) printf("%c",$n); found[$n]=1;} n++; 
    } print ""}' test="$str1"

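And another one. This one builds a regular expression to match against (note: it does not work with special characters, but that is not hard to fix with another sed):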

echo "$str1" | 
    grep -E -o ^`echo -n "$str2" | sed 's/\(.\)/(|\1/g'; echo "$str2" | sed 's/./)/g'`


2011-08-07 13:47:12

Good idea to use 'awk', but it does not work with this example: 'awk 'BEGIN{FS=""} {n=0; while(n<=NF) {if ($n == substr(test,n,1)) {printf("%c",$n);} n++;} print ""}' test="/aa/ba/" <<< "/aa/bb/"'. It displays '/aa/b/' instead of '/aa/b'. Please try to fix your answer. Cheers – olibre

@olibre: Strange report :) Fixed it. –

Since everyone loves a Perl one-liner full of punctuation:

perl -e '$a{$_}++ for split "",shift; $b{$_}++ for split "",shift; for (sort keys %a){print if defined $b{$_}}' my_foo not_my_bar

Creates the hashes %a and %b from the input strings.
Prints any characters common to both strings.

Output:

_moy


2015-11-06 00:18:10
