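awk to count, sum, and count unique values: improving the command
I would like to print, for each combination of the 2nd and 4th columns: the number of line items, the sum of the 3rd column, and the number of unique values in the 1st column.
Input.csv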
abc,xx,5,Jan-2014
abc,yy,10,Jan-2014
def,xx,15,Jan-2014
def,yy,20,Jan-2014
abc,xx,5,Jan-2014
abc,yy,10,Jan-2014
def,xx,15,Jan-2014
def,yy,20,Jan-2014
ghi,zz,10,Jan-2014
abc,xx,5,Feb-2014
abc,yy,10,Feb-2014
def,xx,15,Feb-2014
def,yy,20,Feb-2014
abc,xx,5,Feb-2014
abc,yy,10,Feb-2014
def,xx,15,Feb-2014
def,yy,20,Feb-2014
ghi,zz,10,Feb-2014
Attempt #1:
awk '
BEGIN { FS = OFS = "," }
{ keys=$2","$4;keys[$2][$4]++; sum[$2]+=$3 } !seen[$1,$2,$4]++ { count[$2]++ }
END { for(key in keys) print key, keys[key], sum[key], count[key] }
' Input.csv
Attempt #2:
awk '
BEGIN { FS = OFS = "," }
{ keys=[$2][$4];keys[$2][$4]++; sum[$2]+=$3 } !seen[$1,$2,$4]++ { count[$2]++ }
END { for(key in keys) print key, keys[key], sum[key], count[key] }
' Input.csv
Attempt #3:
awk '
BEGIN { FS = OFS = "," }
{ keys=[$2,$4];keys[$2][$4]++; sum[$2]+=$3 } !seen[$1,$2,$4]++ { count[$2]++ }
END { for(key in keys) print key, keys[key], sum[key], count[key] }
' Input.csv
Desired output:
xx,Jan-2014,4,40,2
yy,Jan-2014,4,60,2
zz,Jan-2014,1,10,1
xx,Feb-2014,4,40,2
yy,Feb-2014,4,60,2
zz,Feb-2014,1,10,1
Looking for your suggestions!
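One possible approach, offered as a sketch rather than a definitive answer. All three attempts assign a scalar to `keys` and then use `keys` as an array, which awk rejects, and `[$2][$4]` on its own is not valid awk syntax. The attempts also index `sum` and `count` by `$2` alone, so the months would be merged. The version below keys every array on the string `$2 OFS $4`, so it should run in any POSIX awk without true multidimensional arrays. The wrapper function name `summarize`, the `order` array that preserves first-seen order, and the guard that skips lines with fewer than four fields are my own assumptions, not part of the question.

```shell
# Hypothetical wrapper: summarize FILE
# Prints col2,col4,line count,sum of col3,number of distinct col1 values.
summarize() {
  awk '
  BEGIN { FS = OFS = "," }
  NF < 4 { next }                            # assumption: skip blank or short lines
  {
    key = $2 OFS $4                          # composite key: 2nd and 4th columns
    if (!(key in rows)) order[++n] = key     # remember first-seen order of keys
    rows[key]++                              # number of line items for this key
    sum[key] += $3                           # running sum of the 3rd column
    if (!seen[$1, $2, $4]++) uniq[key]++     # count each 1st-column value once per key
  }
  END {
    for (i = 1; i <= n; i++) {
      key = order[i]
      print key, rows[key], sum[key], uniq[key]
    }
  }
  ' "$1"
}
```

Called as `summarize Input.csv`, this should print the groups in the order they first appear in the file. A plain `for (key in rows)` loop would also work, but awk does not guarantee its iteration order.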
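@EdMorton Yes, good suggestion. Using composite keys separated by `SUBSEP` has always done the job for me, so I have never really used true multidimensional arrays in `awk`. I guess I should start reading up on them. :) – 2014-09-05 16:12:16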
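2-D arrays are very useful for many common applications. One application, for example, is reading the values for a common key spread across multiple lines. With a pseudo-2-D array: `a[$1] = ($1 in a ? a[$1] RS : "") $2; ... END { for (key in a) { split(a[key], tmp, RS); for (i = 1; i in tmp; i++) { val = tmp[i]; print key, val } } }` becomes simpler with a true 2-D array: `a[$1][$2]; ... END { for (key in a) for (val in a[key]) print key, val }`. There might be a syntax error or 2 in there, but you get the idea. – 2014-09-05 16:18:14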
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/60716/discussion-between-avn-and-jaypal). – VNA 2014-09-05 17:21:06