File 1
$ cat file1
A B
hello 0.5
bye 0.4
File 2
$ cat file2
C D
hello 1
country 5
Output
$ awk 'NR==1{print "Text","B","D"}FNR==1{next}FNR==NR{A[$1]=$2;next}{f=($1 in A);print $0,(f ? A[$1] : "");if(f)delete A[$1]}END{for(i in A)print i,"",A[i]}' OFS='\t' file2 file1
Text B D
hello 0.5 1
bye 0.4
country 5
A more readable version
awk '
# NR counts records across all input files; FNR restarts at 1 for each file.
# NR==1 is therefore true only for the first line of the first file (file2),
# so the header is printed exactly once.
NR==1{print "Text","B","D"}
# FNR==1 is true for the first line of every input file:
# skip the header line of each file.
FNR==1{
    next
}
# FNR==NR is true only while reading the first file (file2)
FNR==NR{
    # Build an associative array keyed on column 1,
    # storing column 2 as the value
    A[$1]=$2
    # Skip the remaining blocks and move on to the next line
    next
}
# From here on we are reading the second file (file1)
{
    # f is 1 (true) if column 1 exists as an index in A, otherwise 0 (false).
    # The parentheses matter: without them f would receive the result of the
    # ?: expression (the stored value or ""), not the membership test.
    f = ($1 in A)
    # Print the current line plus the matching element of A, or "" if none
    print $0,(f ? A[$1] : "")
    # Delete the matched element so it is not printed again in END
    if(f)delete A[$1]
}
# Finally, in the END block, print the remaining array elements,
# i.e. keys from file2 that do not exist in file1
END{
    for(i in A)
        print i,"",A[i]
}
' OFS='\t' file2 file1
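To see how NR and FNR behave when awk reads two files, a small sketch like the following may help. It recreates the sample files from the question in a scratch directory (the tab-separated layout and the `tmpdir` and `counters` names are assumptions made for illustration) and prints the file name next to both counters.

```shell
# Recreate the two sample files in a scratch directory
# (tab-separated content is an assumption about the real files).
tmpdir=$(mktemp -d)
printf 'A\tB\nhello\t0.5\nbye\t0.4\n' > "$tmpdir/file1"
printf 'C\tD\nhello\t1\ncountry\t5\n' > "$tmpdir/file2"

# For every input line, print the current file name, the global record
# number (NR) and the per-file record number (FNR).
counters=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' file2 file1)
printf '%s\n' "$counters"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

While file2 is being read the two counters are equal, which is why `FNR==NR` selects only the first file. Once awk moves on to file1, FNR starts again at 1 while NR keeps counting from 4.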
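Is your sample file1 really like that? Where are the tabs? Why are you sorting on `-k2` but joining with `-j 1`? Also note that the `-e` option in `man join` may help find the unmatched items. Good luck. – shellter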
This worked for me: `join -t $'\t' -1 1 -2 1 <(sort -k1 file1.tsv) <(sort -k1 file2.tsv) > join_test.tsv`. The main problem I had was defining the tab delimiter. – jxn
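Good catch, and sorry I missed that key point. I'm glad you have a solution. It never hurts to acknowledge those who have posted usable solutions; it gives people an incentive to share what they know. Good luck to all. – shellter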
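The join command from the comments keeps only the keys present in both files. As a rough sketch of how the `-e` option mentioned above could be combined with `-a` and `-o` to keep unmatched rows as well, something like the following might work. The `NA` placeholder, the scratch file names and the header stripping with `tail` are assumptions for illustration and are not taken from the thread.

```shell
# Recreate the sample data as tab-separated files in a scratch directory.
tmpdir=$(mktemp -d)
tab=$(printf '\t')
printf 'A\tB\nhello\t0.5\nbye\t0.4\n' > "$tmpdir/file1.tsv"
printf 'C\tD\nhello\t1\ncountry\t5\n' > "$tmpdir/file2.tsv"

# join expects both inputs sorted on the join field, so drop each header
# line and sort on column 1 using the same collation as join.
tail -n +2 "$tmpdir/file1.tsv" | LC_ALL=C sort -t "$tab" -k1,1 > "$tmpdir/f1.sorted"
tail -n +2 "$tmpdir/file2.tsv" | LC_ALL=C sort -t "$tab" -k1,1 > "$tmpdir/f2.sorted"

# -a 1 -a 2    also print unpairable lines from both files
# -o 0,1.2,2.2 output the join key, then column 2 of each file
# -e NA        fill fields that are missing on one side
joined=$(LC_ALL=C join -t "$tab" -a 1 -a 2 -e NA -o 0,1.2,2.2 \
    "$tmpdir/f1.sorted" "$tmpdir/f2.sorted")
printf '%s\n' "$joined"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

With the sample data this prints one row per key (bye, country, hello), with `NA` wherever one of the files has no value for that key. Unlike the awk version, the rows come out sorted by key rather than in file order, and the header line has to be added back separately.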