File comparison in awk

I have two files, each listing component names and version numbers separated by a space:

cat file1 
com.acc.invm:FNS_PROD 94.0.5 
com.acc.invm:FNS_TEST_DCCC_Mangment 94.1.6 
com.acc.invm:FNS_APIPlat_BDMap 100.0.9 
com.acc.invm:SendEmail 29.6.113 
com.acc.invm:SendSms 12.23.65 
com.acc.invm:newSer 10.10.10 

cat file2 
com.acc.invm:FNS_PROD 94.0.5 
com.acc.invm:FNS_TEST_DCCC_Mangment 94.0.6 
com.acc.invm:FNS_APIPlat_BDMap 100.0.10 
com.acc.invm:SendEmail 29.60.113 
com.acc.invm:SendSms 133.28.65 
com.acc.invm:distri_cob 110.10.10

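The required output is:

(1) The list of components from file1 that are present in file1 but not in file2.
(2) The list of components from file2 that are present in file2 but not in file1.

For this example the expected output is:

Components from file1: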

com.acc.invm:newSer 10.10.10

Components from file2:

com.acc.invm:distri_cob 110.10.10

Note: a component that exists in both files must be ignored, even if its version differs.

My code is: (1)

cat new.awk 
{ split($2,a,/\./); curr = a[1]*10000 + a[2]*100 + a[3] } 
NR==FNR { prev[$1] = curr; next } 
!($1 in prev) && (curr > prev[$1]) 

/usr/bin/nawk -f new.awk f2 f1

OUTPUT

com.acc.invm:newSer 10.10.10

(2)

/usr/bin/nawk -f new.awk f1 f2

OUTPUT

com.acc.invm:distri_cob 110.10.10

Is this logic correct? Also,

can anyone help me write new.awk inline in the script itself, so that a separate new.awk file is not needed to run it?

Source

2015-10-19 rKSH

You can print the components that are unique to each file with a single invocation of awk:

# Save all the components from the first file into an array 
NR == FNR { a[$1] = $0; next } 

# If a component from the second file is found, delete it from the array 
$1 in a { delete a[$1]; next } 

# If a component in the second file is not found, print it 
{ print } 

# Print all the components from the first file that weren't in the second 
END { for (i in a) print a[i] } 


$ cat file1 
com.acc.invm:FNS_PROD 94.0.5 
com.acc.invm:FNS_TEST_DCCC_Mangment 94.1.6 
com.acc.invm:FNS_APIPlat_BDMap 100.0.9 
com.acc.invm:SendEmail 29.6.113 
com.acc.invm:SendSms 12.23.65 
com.acc.invm:newSer 10.10.10 


$ cat file2 
com.acc.invm:FNS_PROD 94.0.5 
com.acc.invm:FNS_TEST_DCCC_Mangment 94.0.6 
com.acc.invm:FNS_APIPlat_BDMap 100.0.10 
com.acc.invm:SendEmail 29.60.113 
com.acc.invm:SendSms 133.28.65 
com.acc.invm:distri_cob 110.10.10 


$ awk -f cf.awk file2 file1 
com.acc.invm:newSer 10.10.10 
com.acc.invm:distri_cob 110.10.10
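
The output above does not say which file each line came from, while the question asks for two labelled lists. A minimal sketch of one way to get that in a single pass follows; the array names (line, order, common, only2) and the scratch directory are my own choices for illustration, not part of the script above.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.1.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.9
com.acc.invm:SendEmail 29.6.113
com.acc.invm:SendSms 12.23.65
com.acc.invm:newSer 10.10.10
EOF
cat > "$dir/file2" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.0.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.10
com.acc.invm:SendEmail 29.60.113
com.acc.invm:SendSms 133.28.65
com.acc.invm:distri_cob 110.10.10
EOF

report=$(awk '
    # First file: remember each line by component name, keeping input order.
    NR == FNR { if (!($1 in line)) order[++n1] = $1; line[$1] = $0; next }
    # Second file: a name already seen in the first file is common to both.
    $1 in line { common[$1] = 1; next }
    # Otherwise the component exists only in the second file.
    { only2[++n2] = $0 }
    END {
        print "Components from file1:"
        for (i = 1; i <= n1; i++)
            if (!(order[i] in common)) print line[order[i]]
        print "Components from file2:"
        for (i = 1; i <= n2; i++) print only2[i]
    }
' "$dir/file1" "$dir/file2")

printf '%s\n' "$report"
rm -rf "$dir"
```

Marking common names instead of deleting them means a name repeated in the second file is not mistaken for a new component. Versions are never compared, so a component with a different version in each file is still treated as common.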

For the second part of your question, if you want to run this without a separate awk file, you can put the code inline like this:

awk 'NR==FNR {a[$1]=$0; next} $1 in a {delete a[$1]; next} 1; END {for (i in a) print a[i]}' file2 file1

(Note that the 1 before END is the same as having { print }, because 1 is always true and print is the default action.)
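
If you would rather keep the two lists separate, as in the question, the awk program can also live inside a shell function in the script itself. This is only a sketch: the function name only_in and the array name seen are hypothetical, and it assumes that matching on the component name alone is enough, since versions are to be ignored.

```shell
#!/bin/sh
# only_in A B: print the lines of A whose component name (field 1) is absent from B.
# The awk program is embedded here, so no separate .awk file is needed.
only_in() {
    awk 'NR == FNR { seen[$1]; next } !($1 in seen)' "$2" "$1"
}

# Sample data copied from the question, written to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.1.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.9
com.acc.invm:SendEmail 29.6.113
com.acc.invm:SendSms 12.23.65
com.acc.invm:newSer 10.10.10
EOF
cat > "$dir/file2" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.0.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.10
com.acc.invm:SendEmail 29.60.113
com.acc.invm:SendSms 133.28.65
com.acc.invm:distri_cob 110.10.10
EOF

from_file1=$(only_in "$dir/file1" "$dir/file2")
from_file2=$(only_in "$dir/file2" "$dir/file1")

echo "Components from file1:"
printf '%s\n' "$from_file1"
echo "Components from file2:"
printf '%s\n' "$from_file2"
rm -rf "$dir"
```

The function reads B first to collect the names, then prints every line of A whose name was not collected. No version arithmetic is involved, so a membership test on the first field is all that decides the result.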

Source

2015-10-19 12:56:12 jas

May I suggest a simple one-liner that does the same without any awk programming?

cat file2 file1 file2|cut -f 1 -d" "|sort|uniq -u| xargs -I'{}' grep '{}' file1 
com.acc.invm:newSer 10.10.10 


cat file1 file2 file1|cut -f 1 -d" "|sort|uniq -u| xargs -I'{}' grep '{}' file2 
com.acc.invm:distri_cob 110.10.10
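
One caveat with the pipelines above: grep treats each name as a regular expression and matches it anywhere in the line, so a name that is a prefix of another name can pull in extra lines. The sketch below shows this with made-up data (SendSmsV2 is a hypothetical component, not from the question) and a possible tightening that uses a fixed-string match with the trailing space.

```shell
#!/bin/sh
dir=$(mktemp -d)
# Hypothetical data: "SendSms" is a prefix of "SendSmsV2".
cat > "$dir/file1" <<'EOF'
com.acc.invm:SendSms 12.23.65
com.acc.invm:SendSmsV2 1.0.0
EOF
cat > "$dir/file2" <<'EOF'
com.acc.invm:SendSmsV2 2.0.0
EOF

# Names that occur only in file1 (same trick as above: file2 is listed twice).
names=$(cat "$dir/file2" "$dir/file1" "$dir/file2" | cut -d' ' -f1 | sort | uniq -u)

# Plain grep: the name also matches the longer SendSmsV2 line.
loose=$(printf '%s\n' "$names" | xargs -I'{}' grep '{}' "$dir/file1")

# Fixed-string match that includes the space separating name and version.
strict=$(printf '%s\n' "$names" | xargs -I'{}' grep -F '{} ' "$dir/file1")

echo "loose:";  printf '%s\n' "$loose"
echo "strict:"; printf '%s\n' "$strict"
rm -rf "$dir"
```

With the sample files from the question this makes no difference, because no component name there is a prefix of another.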

Source

2015-10-19 12:43:45 LiMar

If you only need the component names (without versions):

$ p() { cut -d' ' -f1 $1 | sort; }; comm -23 <(p file1) <(p file2) 
com.acc.invm:newSer 

$ p() { cut -d' ' -f1 $1 | sort; }; comm -13 <(p file1) <(p file2) 
com.acc.invm:distri_cob

If you need the version numbers as well, you can pipe the result to

... | xargs -I{} grep {} file2

and similarly for file1, just like @LiMar's solution.
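
As a possible alternative that keeps the version numbers without the extra grep step, join -v prints the unpaired lines of one input in full. This swaps comm for join and is only a sketch: the sorted file names are my own, and it relies on both inputs being sorted on the first field under the same locale.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.1.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.9
com.acc.invm:SendEmail 29.6.113
com.acc.invm:SendSms 12.23.65
com.acc.invm:newSer 10.10.10
EOF
cat > "$dir/file2" <<'EOF'
com.acc.invm:FNS_PROD 94.0.5
com.acc.invm:FNS_TEST_DCCC_Mangment 94.0.6
com.acc.invm:FNS_APIPlat_BDMap 100.0.10
com.acc.invm:SendEmail 29.60.113
com.acc.invm:SendSms 133.28.65
com.acc.invm:distri_cob 110.10.10
EOF

# join expects both inputs sorted on the join field (field 1 by default).
LC_ALL=C sort -k 1,1 "$dir/file1" > "$dir/f1.sorted"
LC_ALL=C sort -k 1,1 "$dir/file2" > "$dir/f2.sorted"

# -v 1 prints lines of the first input with no partner in the second; -v 2 the reverse.
only1=$(LC_ALL=C join -v 1 "$dir/f1.sorted" "$dir/f2.sorted")
only2=$(LC_ALL=C join -v 2 "$dir/f1.sorted" "$dir/f2.sorted")

printf '%s\n' "$only1"
printf '%s\n' "$only2"
rm -rf "$dir"
```

Because join pairs lines on the first field only, a component whose version differs between the two files still counts as present in both.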

Source

2015-10-19 14:56:16 karakfa
