2017-07-10 71 views
2

Hi all, I have the following data, and I need to ignore the spaces that appear after a specific column number.

61684 376 23 106 38695633 1 0 0 -1 /C/Program Files (x86)/ 16704 root;[email protected]:SERVICE root;[email protected]:SERVICE 0 1407331175 1407331175 1247541608 
8634 416 13 86 574126 1 0 0 -1 /E/KYCImages/ 16832 root;[email protected] root;[email protected] 0 1406018846 1406018846 1352415392 
60971 472 22 86 38613076 1 0 0 -1 /E/KYCwebsvc binaries/ 16832 root;[email protected] root;[email protected] 0 1390829495 1390829495 1353370744 
1 416 10 86 1 1 0 0 -1 /E/KycApp/ 16832 root;[email protected] root;[email protected] 0 1411465772 1411465772 1351291187 

Right now I am using the following code:

awk 'BEGIN{FPAT = "([^ ]+)|(\"[^\"]+\")"}{print $10}' | awk '$1!~/^\/\./' | sort -u | sed -e 's/\,//g' | perl -p00e 's/\n(?!\Z)/;/g' filename 

I get this output:

/C/Program;/E/KycApp/;/E/KYCImages/;/E/KycServices/;/E/KYCwebsvc 

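However, I need the output to start at $10 and run until the closing "/" is encountered. Basically, I want to ignore any spaces inside column 10 until that "/" is reached. Is this possible?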

The desired output is:

/C/Program Files (x86)/;/E/KycApp/;/E/KYCImages/;/E/KycServices/;/E/KYCwebsvc binaries/ 
+0

If your `grep` has the `-o` option, it looks like this is what you want: `grep -o '/[^.].*/' filename | sort -u | paste -sd';'`. Your sample data should also include lines that show why you need `awk '$1!~/^\/\./'` or `sed -e 's/\,//g'`. – Sundeep

+0

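`paste -sd';'` does not work on AIX, so I used `perl -p00e 's/\n(?!\Z)/;/g'`. I need `awk '$1!~/^\/\./'` to ignore anything that starts with "/" followed by ".". Also, `grep -o` does not work on AIX, so I need something else. – Sid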

+0

A single command then? `perl -lne '($p)=/(\/[^.].*\/)/; $h{$p}=1; END{print join ";", keys %h}' filename` – Sundeep

Answers

1

Using a single gawk command:

awk 'BEGIN{ FPAT="/[^/]+/[^/]+/"; PROCINFO["sorted_in"]="@ind_str_asc"; IGNORECASE = 1 } 
    { a[$1] }END{ for(i in a) r=(r!="")? r";"i : i; print r }' filename 

Output (without /E/KycServices/; because it is not in your input):

/C/Program Files (x86)/;/E/KycApp/;/E/KYCImages/;/E/KYCwebsvc binaries/ 
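Since the comments mention AIX, where gawk extensions such as FPAT and PROCINFO["sorted_in"] may not be available, here is a rough sketch of the same idea using only POSIX awk and sort. It is not the answer's method: it swaps the FPAT approach for match(), assumes the path is the only part of each line that contains "/", and keeps the question's filter that drops paths starting with "/.". The sample file, its trimmed trailing columns, and the "/.hidden/" line are made up for illustration.

```shell
# Hypothetical sample file: the question's paths plus a duplicate and a "/." path.
sample="${TMPDIR:-/tmp}/paths_sample.$$"
cat > "$sample" <<'EOF'
61684 376 23 106 38695633 1 0 0 -1 /C/Program Files (x86)/ 16704 root 0
8634 416 13 86 574126 1 0 0 -1 /E/KYCImages/ 16832 root 0
60971 472 22 86 38613076 1 0 0 -1 /E/KYCwebsvc binaries/ 16832 root 0
1 416 10 86 1 1 0 0 -1 /E/KycApp/ 16832 root 0
1 416 10 86 1 1 0 0 -1 /.hidden/ 16832 root 0
1 416 10 86 1 1 0 0 -1 /E/KycApp/ 16832 root 0
EOF

result=$(
  # Step 1: pull out the text from the first "/" to the last "/" on each line,
  # skipping lines without a match and paths that start with "/.".
  awk 'match($0, /\/.*\//) {
         path = substr($0, RSTART, RLENGTH)
         if (path !~ /^\/\./) print path
       }' "$sample" |
  # Step 2: de-duplicate; LC_ALL=C makes the byte-wise order predictable.
  LC_ALL=C sort -u |
  # Step 3: join the lines with ";" without relying on paste or perl.
  awk '{ out = (NR == 1) ? $0 : (out ";" $0) } END { print out }'
)
rm -f "$sample"
printf '%s\n' "$result"
```

Note that the order differs from the gawk answer: with LC_ALL=C the sort is case-sensitive, so /E/KYCImages/ and /E/KYCwebsvc binaries/ come before /E/KycApp/.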
+0

Very nice use of `PROCINFO["sorted_in"]`. – CWLiu

0

Try this in a single awk as well.

awk '{match($0,/\/.*\//);VAL=VAL?VAL ORS substr($0,RSTART,RLENGTH):substr($0,RSTART,RLENGTH)} END{num=split(VAL, array,"\n");for(i=1;i<=num;i++){printf("%s%s",array[i],i==num?"":";")};print""}' Input_file 

I will add a non-one-liner form of the solution shortly.

EDIT1: Added the non-one-liner form of the solution.

awk '{ 
     match($0,/\/.*\//); 
     VAL=VAL?VAL ORS substr($0,RSTART,RLENGTH):substr($0,RSTART,RLENGTH) 
    } 
     END{ 
       num=split(VAL, array,"\n"); 
       for(i=1;i<=num;i++){ 
             printf("%s%s",array[i],i==num?"":";") 
            }; 
       print"" 
      } 
    ' Input_file 

EDIT2: Added an explanation of the code to the non-one-liner form.

awk '{ 
     match($0,/\/.*\//); ## match() finds the text from the first "/" to the last "/" in the line; the slashes are escaped inside the regex.
     VAL=VAL?VAL ORS substr($0,RSTART,RLENGTH):substr($0,RSTART,RLENGTH) ## Append the matched text to VAL, separated by ORS (a newline) when VAL already has a value. RSTART and RLENGTH are built-in awk variables that match() sets to the start position and length of the match.
    } 
     END{ ## The END block runs after all input has been read.
       num=split(VAL, array,"\n"); ## split() breaks VAL on newlines into the array named array and returns the number of elements, stored in num.
       for(i=1;i<=num;i++){ ## Loop over the elements from 1 to num.
             printf("%s%s",array[i],i==num?"":";") ## Print each element followed by ";", except the last one, which is followed by nothing.
            }; 
       print"" ## Print an empty string so the output ends with a newline.
      } 
    ' Input_file ## Input_file is the file to read.
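To make the explanation of match(), RSTART and RLENGTH concrete, here is a small sketch that prints what match() reports for one line taken from the question (with the trailing columns trimmed) and for a made-up line that has no "/" at all. The second line is an assumption added for illustration; it shows that RSTART is 0 and RLENGTH is -1 when nothing matches, so a script that wants to skip such lines can test the return value of match() first.

```shell
report=$(
  printf '%s\n' \
    '60971 472 22 86 38613076 1 0 0 -1 /E/KYCwebsvc binaries/ 16832 root 0' \
    'a line with no slashes' |
  awk '{
    # match() returns the start position of the match, or 0 when there is none.
    if (match($0, /\/.*\//))
      printf "%d %d %s\n", RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    else
      printf "%d %d no match\n", RSTART, RLENGTH
  }'
)
printf '%s\n' "$report"
```

The first output line gives the start position, the length and the extracted path; the second shows the values left behind by a failed match.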