awk can do this. Take a look at this one-liner:
awk -F'[}{]' '{split($2,a,",");delete(b);for(x in a)b[a[x]]}length(b)<=2' file
Let's run a small test:
kent$ cat file
ok,{(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1),(XXX1),(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1)},whatever,unique=1
ok,{},whatever,unique=0
nok,{(XXX1),(XXX2),(XXX3),(XXX4)},whatever
kent$ awk -F'[}{]' '{split($2,a,",");delete(b);for(x in a)b[a[x]]}length(b)<=2' file
ok,{(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1),(XXX1),(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1)},whatever,unique=1
ok,{},whatever,unique=0
As you can see, the nok line has been filtered out.
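For readers who find the one-liner dense, here is a minimal sketch of the same filter written out with comments. It is not the exact command from the answer: the function name filter_le2 and the variable names are made up for illustration, and it counts distinct items by hand and empties the array with split("", seen), on the assumption that some awk implementations do not accept length() on an array or delete on a whole array.

```shell
# Sample input in the same shape as the test file above.
sample='ok,{(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1),(XXX1),(XXX1),(XXX2)},whatever,unique=2
ok,{(XXX1)},whatever,unique=1
ok,{},whatever,unique=0
nok,{(XXX1),(XXX2),(XXX3),(XXX4)},whatever'

# filter_le2 is a hypothetical helper name; it reads lines from stdin.
filter_le2() {
  awk -F'[}{]' '
  {
    n = split($2, items, ",")      # $2 is the text between { and }
    split("", seen)                # portable way to empty an array
    uniq = 0
    for (i = 1; i <= n; i++)
      if (!(items[i] in seen)) {   # count each distinct item once
        seen[items[i]] = 1
        uniq++
      }
  }
  uniq <= 2                        # print lines with at most 2 distinct items
  '
}

result=$(printf '%s\n' "$sample" | filter_le2)
printf '%s\n' "$result"
```

Because the field separator is either brace, the second field is always the content of the braces, so an empty {} yields zero items and the line is kept.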
EDIT
awk -F'[}{]' '{gsub(/[()]/,"");split($2,a,",");delete(b);for(x in a)b[a[x]];l=length(b)}l<=2&&l>0{s="";for(x in b)s=s""x",";sub(/,$/,"",s);y[s]=s $3}END{for(x in y)print y[x]}' file
Test:
kent$ cat file
{(XXX1),(XXX2)},whatever,unique=2
{(XXX1),(XXX1),(XXX1),(XXX2)},whatever,unique=2
{(XXX1)},whatever,unique=1
{},whatever,unique=0
{(XXX1),(XXX2),(XXX3),(XXX4)},whatever
kent$ awk -F'[}{]' '{gsub(/[()]/,"");split($2,a,",");delete(b);for(x in a)b[a[x]];l=length(b)}l<=2&&l>0{s="";for(x in b)s=s""x",";sub(/,$/,"",s);y[s]=s $3}END{for(x in y)print y[x]}' file
XXX1,XXX2,whatever,unique=2
XXX1,whatever,unique=1
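The edited one-liner does several things at once: it strips the parentheses, keeps only lines with one or two distinct items, builds a key from those items, and prints one line per key. The sketch below spells those steps out. It is an illustration rather than the answer's exact command: dedupe_sets and the variable names are hypothetical, and unlike the one-liner (which relies on the unspecified order of for (x in b)) it orders the pair explicitly and prints keys in first-seen order.

```shell
sample='{(XXX1),(XXX2)},whatever,unique=2
{(XXX1),(XXX1),(XXX1),(XXX2)},whatever,unique=2
{(XXX1)},whatever,unique=1
{},whatever,unique=0
{(XXX1),(XXX2),(XXX3),(XXX4)},whatever'

# dedupe_sets is a hypothetical helper name; it reads lines from stdin.
dedupe_sets() {
  awk -F'[}{]' '
  {
    gsub(/[()]/, "")                # drop parentheses, awk then re-splits the fields
    n = split($2, items, ",")
    split("", seen)                 # portable way to empty an array
    uniq = 0
    for (i = 1; i <= n; i++)
      if (!(items[i] in seen)) {
        seen[items[i]] = 1
        first[++uniq] = items[i]    # distinct items in first-seen order
      }
  }
  uniq >= 1 && uniq <= 2 {
    key = first[1]
    if (uniq == 2)                  # order the pair so the key is canonical
      key = (first[1] < first[2]) ? (first[1] "," first[2]) : (first[2] "," first[1])
    if (!(key in out))
      order[++count] = key          # remember first-seen order of keys
    out[key] = key $3               # a later line with the same key overwrites, as in the one-liner
  }
  END {
    for (i = 1; i <= count; i++)
      print out[order[i]]
  }
  '
}

result=$(printf '%s\n' "$sample" | dedupe_sets)
printf '%s\n' "$result"
```

On the sample input this prints the same two lines as the test above.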
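You cannot (reliably) process CSV data with grep, because CSV entries can span multiple lines. Even if you have none of those, grep may not be able to tell whether a given comma (or whatever the delimiter is) sits inside an entry or separates two entries. – 2013-03-08 14:59:27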
Great, thanks! So what would you recommend? – user706838 2013-03-08 15:03:57
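There is http://www.aboutwilson.net/csvgrep/, but I have not looked into what it can and cannot do. Other than that, use a proper CSV parser and writer/serializer and implement your logic on top of it. Personally, I would look at Ruby, but the choice of language probably depends on what you know. – 2013-03-08 15:08:04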