2012-10-19

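Distance between records with awk

Hey, I'm trying to find the distance between records in a text file, and I'm trying to do it with awk. An example input is: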

1 2 1 4 yes 
2 3 2 2 no 
1 1 1 5 yes 
4 2 4 0 no 
5 1 0 1 no 

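I would like to find the distance between each of these records using their numerical values. I do this by subtracting the values and then squaring the result. I have tried the code below, but all the distances come out as 0. Any help would be appreciated.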

BEGIN { recs = 0; fieldnum = 5; }
{
    recs++;
    for (i = 1; i <= NF; i++) { data[recs,i] = $i; }
}
END {
    for (r = 1; r <= recs; r++) {
        for (f = 1; f < fieldnum; f++) {
            ## find distances
            for (t = 1; t <= recs; t++) {
                distance[r,t] += ((data[r,f] - data[t,f]) * (data[r,f] - data[t,f]));
            }
        }
    }
    for (r = 1; r <= recs; r++) {
        for (t = 1; t < recs; t++) {
            ## print distances
            printf("distance between %d and %d is %d \n", r, t, distance[r,t]);
        }
    }
}

Please include some sample output and _define_ distance. – Steve
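Read literally, the code in the question appears to accumulate a squared Euclidean distance over the four numeric fields; the fifth (yes/no) field is skipped because the loop runs while f < fieldnum. This is an interpretation of the code, not a definition the asker gave. Writing x_{r,f} for field f of record r (notation introduced here for illustration):

```latex
d(r,t) = \sum_{f=1}^{4} \left( x_{r,f} - x_{t,f} \right)^{2}
```

For records 1 and 2 of the sample input this gives (1-2)^2 + (2-3)^2 + (1-2)^2 + (4-2)^2 = 1 + 1 + 1 + 4 = 7.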

Answer

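I don't know what you mean conceptually by "the distance between each of these numerical values", so I can't help with your algorithm, but let's clean up the code and see what it looks like: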

$ cat tst.awk 
{
    for (i = 1; i <= NF; i++) {
        data[NR,i] = $i
    }
}
END {
    for (r = 1; r <= NR; r++) {
        for (f = 1; f < NF; f++) {
            ## find distances
            for (t = 1; t <= NR; t++) {
                delta = data[r,f] - data[t,f]
                distance[r,t] += (delta * delta)
            }
        }
    }
    for (r = 1; r <= NR; r++) {
        for (t = 1; t < NR; t++) {
            ## print distances
            printf "distance between %d and %d is %d\n", r, t, distance[r,t]
        }
    }
}
$ 
$ awk -f tst.awk file 
distance between 1 and 1 is 0 
distance between 1 and 2 is 7 
distance between 1 and 3 is 2 
distance between 1 and 4 is 34 
distance between 2 and 1 is 7 
distance between 2 and 2 is 0 
distance between 2 and 3 is 15 
distance between 2 and 4 is 13 
distance between 3 and 1 is 2 
distance between 3 and 2 is 15 
distance between 3 and 3 is 0 
distance between 3 and 4 is 44 
distance between 4 and 1 is 34 
distance between 4 and 2 is 13 
distance between 4 and 3 is 44 
distance between 4 and 4 is 0 
distance between 5 and 1 is 27 
distance between 5 and 2 is 18 
distance between 5 and 3 is 33 
distance between 5 and 4 is 19 

It seems to produce some non-zero output....
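
One thing the listing above leaves out: the printing loop stops one short of the last record, so no "distance between N and 5" lines appear. A minimal sketch that prints every pair follows. It assumes the first four fields are numeric and the fifth is a label, and the file names `records.txt`, `dist.awk` and `distances.txt` are made up for this example.

```shell
# Sample input from the question: four numeric fields plus a yes/no label.
cat > records.txt <<'EOF'
1 2 1 4 yes
2 3 2 2 no
1 1 1 5 yes
4 2 4 0 no
5 1 0 1 no
EOF

# dist.awk is a hypothetical script name used only for this sketch.
cat > dist.awk <<'EOF'
# Store every field of every record, keyed by (record number, field number).
{
    for (i = 1; i <= NF; i++) data[NR, i] = $i
}
END {
    nfields = 4    # assumption: only the first four fields are numeric
    for (r = 1; r <= NR; r++) {
        for (t = 1; t <= NR; t++) {    # <= so the last record is included
            sum = 0
            for (f = 1; f <= nfields; f++) {
                delta = data[r, f] - data[t, f]
                sum += delta * delta   # squared difference for this field
            }
            printf "distance between %d and %d is %d\n", r, t, sum
        }
    }
}
EOF

awk -f dist.awk records.txt > distances.txt
cat distances.txt
```

As a hand check, records 1 and 5 differ by -4, 1, 1 and 3 across the four numeric fields, so the sum of squares is 16 + 1 + 1 + 9 = 27, which agrees with the "5 and 1" line in the output above.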