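Find a specific column and replace the following column with a specific value using gawk
I am trying to find every place where my data has a duplicated row and delete the duplicate. In addition, I am looking for rows where the second column has the value 90, and I want to replace the second column of the row that follows with a specific number that I designate.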
My data looks like this:
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
7 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 31 0 0 0.0000 70221
I want my data to look like this:
# Type Response Acc RT Offset
1 70 0 0 0.0000 57850
2 31 0 0 0.0000 59371
3 41 0 0 0.0000 60909
4 70 0 0 0.0000 61478
5 31 0 0 0.0000 62999
6 41 0 0 0.0000 64537
8 70 0 0 0.0000 65106
9 11 0 0 0.0000 66627
10 21 0 0 0.0000 68165
11 90 0 0 0.0000 68700
12 5 0 0 0.0000 70221
My code:
BEGIN {
priorline = "";
ERROROFFSET = 50;
ERRORVALUE[10] = 1;
ERRORVALUE[11] = 2;
ERRORVALUE[12] = 3;
ERRORVALUE[30] = 4;
ERRORVALUE[31] = 5;
ERRORVALUE[32] = 6;
ORS = "\n";
}
NR == 1 {
print;
getline;
priorline = $0;
}
NF == 6 {
brandnewline = $0
mytype = $2
$0 = priorline
priorField2 = $2;
if (mytype !~ priorField2) {
print;
priorline = brandnewline;
}
if (priorField2 == "90") {
mytype = ERRORVALUE[mytype];
}
}
END {print brandnewline}
##Here the parameters of the brandnewline is set to the current line and then the
##priorline is set to the line we just worked on and the brandnewline is
##set to be the next new line we are working on. (i.e line 1 = brandnewline, now
##we set priorline = brandnewline, thus priorline is line 1 and brandnewline takes
##on line 2) Next, the same parameters were set with column 2, mytype being the
##current column 2 value and priorField2 being the same value as mytype moves to
##the next column 2 value. Finally, we wrote an if statement where, if the value
##in column 2 of the current line !~ (does not equal) value of column two of the
##previous line, then the current line will be print otherwise it will just be
##skipped over. The second if statement recognizes the lines in which the value
##90 appeared and replaces the value in column 2 with a previously defined
##ERRORVALUE set for each specific type (type 10=1, 11=2,12=3, 30=4, 31=5, 32=6).
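I have been able to remove the duplicate rows successfully. However, I cannot get the next part of my code to work, which is to replace the value in the actual column with the values specified in BEGIN as ERRORVALUE (10 = 1, 11 = 2, 12 = 3, 30 = 4, 31 = 5, 32 = 6). Essentially, I want to replace the value in that row with my ERRORVALUE.
If anyone could help me, I would be very grateful.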
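For reference, here is a minimal sketch of one way to get the desired output. It is not the original poster's method. In the code above, `mytype = ERRORVALUE[mytype]` only changes the variable `mytype` and never assigns to `$2`, so nothing in the printed line changes. Also, `!~` is a regex non-match rather than an inequality test, so for example `"11" !~ "1"` is false. The sketch assumes a row counts as a duplicate when every column except the row number in column 1 equals the previously kept row, and that fields are separated by single spaces. The function name `fix_types` and the helper variables are hypothetical.

```shell
# Sketch: drop consecutive duplicate rows, then remap $2 on the row after a 90.
fix_types() {
  awk '
    BEGIN {
      ERRORVALUE[10] = 1; ERRORVALUE[11] = 2; ERRORVALUE[12] = 3
      ERRORVALUE[30] = 4; ERRORVALUE[31] = 5; ERRORVALUE[32] = 6
    }
    NR == 1 { print; next }              # header line passes through unchanged
    {
      # Assumption: compare everything except the row number in column 1.
      key = $0
      sub(/^[^ \t]+[ \t]+/, "", key)
      if (key == prevkey) next           # same as the last kept row: skip it
      prevkey = key

      after90 = (prevtype + 0 == 90)     # was the previous kept row a type 90?
      prevtype = $2                      # remember the original type first
      if (after90 && ($2 in ERRORVALUE))
        $2 = ERRORVALUE[$2]              # assigning to $2 rebuilds $0
      print
    }
  ' "$@"
}
```

With the sample data saved in a file (say `data.txt`, a placeholder name), `fix_types data.txt` prints the cleaned table. Assigning to `$2` makes awk rebuild the line with the output field separator, which is a single space by default, so the spacing matches the input here.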
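First: thank you so much for your answer, it has already been very helpful. Also, thank you for replying so quickly. Second: one concern I have is whether, after I see a 90 in $2, I can replace the $2 in the line two lines before it. In this example, line 11 has 90 in $2; is it possible to change the $2 in line 9 to the format described in BEGIN, and if so, how would I go about doing it? – user1269741 2012-03-14 20:40:49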
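I'd probably take 2 passes over the file: awk 'remove duplicate lines' | tac | awk 'replace $2 if the value 2 lines earlier was 90' | tac - tac is a handy tool that prints a file from the last line to the first. Otherwise the awk script gets a bit messy, because now you have to remember the previous two lines, take care that the line 2 lines back wasn't deleted, and so on. – 2012-03-14 20:53:06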
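The pipeline described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the commenter's actual script. It uses the same duplicate rule as before (all columns except the row number equal the previously kept row), it needs `tac` from GNU coreutils to be installed, and the function name `fix_two_back` and the `countdown` variable are hypothetical. Reversing the file turns "two lines before a 90" into "two lines after a 90", so the second awk only needs a counter.

```shell
# Sketch: pass 1 removes duplicates; pass 2 runs on the reversed file and
# remaps $2 on the row that sits two lines before a type-90 row.
fix_two_back() {
  awk '
    NR == 1 { print; next }              # header line passes through unchanged
    {
      key = $0
      sub(/^[^ \t]+[ \t]+/, "", key)     # assumption: ignore the row number
      if (key == prevkey) next
      prevkey = key
      print
    }
  ' "$@" |
  tac |
  awk '
    BEGIN {
      ERRORVALUE[10] = 1; ERRORVALUE[11] = 2; ERRORVALUE[12] = 3
      ERRORVALUE[30] = 4; ERRORVALUE[31] = 5; ERRORVALUE[32] = 6
    }
    countdown > 0 {
      countdown--
      # Second line after the 90 in reversed order = two lines before it.
      if (countdown == 0 && ($2 in ERRORVALUE)) $2 = ERRORVALUE[$2]
    }
    $2 + 0 == 90 { countdown = 2 }       # start counting from a type-90 row
    { print }
  ' |
  tac
}
```

Because duplicates are removed in the first pass, the second pass never has to check whether the line two back was deleted, which is the complication the comment warns about. Two 90 rows that sit close together are not handled specially in this sketch.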