Sorting a line of numbers

I have a line that looks like this:

string 2 2 3 3 1 4

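Columns 2, 4 and 6 each hold an ID (assume every ID is unique), and columns 3, 5 and 7 hold the data associated with the corresponding ID.

How can I rearrange this line so that it is sorted by ID?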

string 1 4 2 2 3 3

Note: unlike the example, a line can have any number of IDs.

Using a shell script, I was thinking of something like this:

while read n
do
    echo $(echo $n | sort -k (... stuck here))
done < infile


2017-02-18 namesake22

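First of all, you cannot sort a single line. Try looping over the data and placing it into an array indexed by the odd-positioned values, the IDs (assuming you are right that they are unique) – grail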
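The idea in this comment (index the data by ID, then read it back in ID order) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `sort_by_id` is made up, the first field is treated as a label, IDs are assumed to be unique integers, and every ID is assumed to be followed by exactly one value.

```python
def sort_by_id(line):
    """Reorder the ID/value pairs of a line by numeric ID.

    Hypothetical helper: assumes a leading label, unique integer IDs,
    and exactly one value after each ID.
    """
    label, *fields = line.split()
    # After the label, fields alternate: ID, value, ID, value, ...
    pairs = dict(zip(fields[0::2], fields[1::2]))
    # Order the IDs by their integer value, not as text
    ordered = sorted(pairs, key=int)
    return " ".join([label] + [f"{k} {pairs[k]}" for k in ordered])


print(sort_by_id("string 2 2 3 3 1 4"))  # string 1 4 2 2 3 3
```

Because a dictionary keeps one value per key, a repeated ID would silently overwrite the earlier value, which is why the uniqueness assumption matters here.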

As a bash script, this can be done with:

Code:

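#!/usr/bin/env bash

# print each ID/value pair on its own line
emit_line() {
    while [ $# -gt 0 ]; do
        echo "$1" "$2"
        shift; shift
    done
}

# print the label, then the pairs sorted numerically by ID
# (-n keeps an ID such as 20 after 3 instead of before it)
sort_line() {
    echo "$1"
    shift
    emit_line "$@" | sort -n
}

# loop through the lines in the file and sort each one by ID;
# $n is left unquoted on purpose so it splits into fields, and the
# unquoted $(...) joins the sorted lines back into a single line
while read -r n; do
    echo $(sort_line $n)
done < infile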

The file infile:

string 2 2 3 3 1 4 
string 2 2 0 3 4 4 1 7 
string 2 2 0 3 2 1

Output:

string 1 4 2 2 3 3 
string 0 3 1 7 2 2 4 4 
string 0 3 2 1 2 2

Update:

Borrowing the sorting idea from grail's version, this drops the (much slower) external sort:

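sort_line() {
    line="$1"
    shift

    # store each value at the array index given by its ID
    # (IDs must be unique, non-negative integers)
    while [ $# -gt 0 ]; do
        data[$1]=$2
        shift; shift
    done

    # bash lists the indices of an indexed array in ascending numeric order
    for i in "${!data[@]}"; do
        line="$line $i ${data[i]}"
    done
    unset data
    echo "$line"
}

# $n is left unquoted on purpose so it splits into separate arguments
while read -r n; do
    sort_line $n
done < infile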


2017-02-18 17:08:39

You can use Python for this. The function splits the line into a list of tuples, which can then be sorted. itertools.chain is then used to recombine the key-value pairs.

Code:

import itertools as it

def sort_line(line):
    # split the line on white space
    x = line.split()

    # make a list of (key, value) tuples
    as_tuples = [tuple(x[i:i+2]) for i in range(1, len(x), 2)]

    # sort the tuples numerically by key (assumes integer IDs),
    # then flatten them with chain
    by_id = sorted(as_tuples, key=lambda kv: int(kv[0]))
    sorted_kv = list(it.chain.from_iterable(by_id))

    # join the results back into a string
    return ' '.join([x[0]] + sorted_kv)

Test code:

data = [ 
    "string 2 2 3 3 1 4", 
    "string 2 2 0 3 4 4 1 7", 
] 

for line in data: 
    print(sort_line(line))

Results:

string 1 4 2 2 3 3 
string 0 3 1 7 2 2 4 4
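The test data above only uses single-digit IDs, so it cannot show the difference between ordering IDs as text and ordering them as numbers. A small sketch of how the two diverge once an ID has two digits (the `ids` list is made up for illustration):

```python
ids = ["5", "20", "3"]

# Text ordering compares character by character, so "20" sorts before "3"
as_text = sorted(ids)

# Numeric ordering compares the integer values
as_numbers = sorted(ids, key=int)

print(as_text)     # ['20', '3', '5']
print(as_numbers)  # ['3', '5', '20']
```

If the IDs in real input can exceed 9, the numeric form is probably the one that is wanted.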


2017-02-18 16:42:17

Another bash alternative, which does not depend on how many IDs there are:

#!/usr/bin/env bash 

x='string 2 2 3 3 1 4' 
out="${x%% *}" 

in=($x) 

for ((i = 1; i < ${#in[*]}; i += 2)) 
do 
    new[${in[i]}]=${in[i+1]} 
done 

for i in ${!new[@]} 
do 
    out="$out $i ${new[i]}" 
done 

echo $out

If you want to read from a file, you can wrap a loop around this.


2017-02-18 17:50:48 grail

I'll add a gawk solution to your long list of options.

This is a standalone script:

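#!/usr/bin/env gawk -f
# Note: Linux passes "gawk -f" to env as a single argument, so this
# shebang may fail there; use the full path to gawk instead if needed.

{
    line=$1

    # Collect each ID/value pair as one element of an array,
    for (i=2; i<NF; i+=2) a[i]=$i FS $(i+1)

    # sort the array "a" by value, numerically, ascending
    # (asort() reindexes from 1 and returns the element count),
    n = asort(a, a, "@val_num_asc")

    # and gather the result in a for loop.
    for (i=1; i<=n; i++) line=line FS a[i]

    # Finally, print the line,
    print line

    # and clear the array for the next round.
    delete a
}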

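This works by copying your tuples into an array, sorting the array, and then reassembling the sorted tuples in a for loop that appends each array element to the output line.

Note that because it uses asort(), this is gawk-only (it will not run in traditional awk).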

$ cat infile 
string 2 2 3 3 1 4 
other 5 1 20 9 3 7 
$ ./sorttuples infile 
string 1 4 2 2 3 3 
other 3 7 5 1 20 9
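For comparison, the same three steps (copy the pairs into an array, sort it, reassemble the line) can be sketched in Python over a whole input. This is an illustration rather than a translation of the gawk script: the name `sort_tuples` is made up, IDs are assumed to be integers, and `io.StringIO` stands in for an open file such as infile.

```python
import io


def sort_tuples(stream):
    """Yield each line with its ID/value pairs sorted by numeric ID."""
    for raw in stream:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        label, rest = fields[0], fields[1:]
        # 1. copy the pairs into a list of (id, value) tuples
        pairs = list(zip(rest[0::2], rest[1::2]))
        # 2. sort the list by the integer value of the ID
        pairs.sort(key=lambda pair: int(pair[0]))
        # 3. reassemble the line from the sorted pairs
        yield " ".join([label] + [f"{k} {v}" for k, v in pairs])


# io.StringIO stands in for an open file here
infile = io.StringIO("string 2 2 3 3 1 4\nother 5 1 20 9 3 7\n")
for line in sort_tuples(infile):
    print(line)
# string 1 4 2 2 3 3
# other 3 7 5 1 20 9
```

Since the pairs are kept in a list rather than indexed by ID, a repeated ID is preserved instead of overwritten, and Python's stable sort keeps such duplicates in their original relative order.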


2017-02-18 20:12:51 ghoti

This seems like a good idea. I'll look into how gawk works. – namesake22
