2017-06-07

I have a file with 3 columns: ID, name, and value. I want to convert the rows into a horizontal pivot.

1,Brand,sports 
1,Color,White 
1,Gender,Male 
1,Logo,yes 
1,width,10 
4,Brand,Running 
4,width,12 
4,Fits,Lose 
3,catgegory,shoe 
3,Color,blue 
3,Color,white 
3,primarycolor,blue 
5,size,M 
5,Brand,Sports 
5,Brand,Running 

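I want to convert this into a pivot keyed on column 1 (one output row per ID) and column 2 (one output column per name).

It is similar to a pivot table, but it prints the text values in a horizontal layout. Excel cannot do this, because its pivot tables only allow sums and other numeric values.

Can this be generated in UNIX?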

,Brand,Color,Gender,Logo,width,Fits,catgegory,primarycolor,size 
1,sports,White,Male,yes,10,,,, 
4,Running,,,,12,Lose,,, 
3,,blue/white,,,,,shoe,blue, 
5,Sports/Running,,,,,,,,M 

Answers

Input

/tmp$ cat data.txt 
1,Brand,sports 
1,Color,White 
1,Gender,Male 
1,Logo,yes 
1,width,10 
4,Brand,Running 
4,width,12 
4,Fits,Lose 
3,catgegory,shoe 
3,Color,blue 
3,primarycolor,blue 
5,size,M 
5,Brand,Running 

Script

/tmp$ cat pivot.awk 
{ 
    id=$1; name=$2; value=$3 
    ids[id]; 
    # remember each name once, in first-seen order 
    if(!(name in tmp)){ tmp[name]; names[++c]=name; } 
    values[id,name] = value 
} 
END { 
    # comment out the line below to hide the "id" header 
    printf "id" 

    # loop by index: "for (k in names)" visits keys in unspecified order 
    for (n=1; n<=c; n++) { 
     printf "%s%s",OFS,names[n] 
    } 
    print "" 
    # row order is unspecified here; pivot_with_order.awk keeps input order 
    for (id in ids) { 
     printf "%s",id 
     for (n=1; n<=c; n++) { 
      printf "%s%s",OFS,values[id,names[n]] 
     } 
     print "" 
    } 
} 

Execution and output

/tmp$ awk -v FS=, -v OFS=, -f pivot.awk data.txt 
id,Brand,Color,Gender,Logo,width,Fits,catgegory,primarycolor,size 
1,sports,White,Male,yes,10,,,, 
3,,blue,,,,,shoe,blue, 
4,Running,,,,12,Lose,,, 
5,Running,,,,,,,,M 

This version produces the same output as the expected output, including the row order:

/tmp$ cat pivot_with_order.awk 
{ 
    id=$1; name=$2; value=$3 

    # remember ids and names once each, in first-seen order 
    if(!(id in itmp)){ itmp[id]; ids[++i]=id; } 
    if(!(name in tmp)){ tmp[name]; names[++c]=name; } 

    values[id,name] = value 
} 
END { 
    # uncomment the line below to print an "id" header 
    # printf "id" 

    # loop by index: "for (k in arr)" visits keys in unspecified order 
    for (n=1; n<=c; n++) { 
     printf "%s%s",OFS,names[n] 
    } 
    print "" 
    for (r=1; r<=i; r++) { 
     printf "%s",ids[r] 
     for (n=1; n<=c; n++) { 
      printf "%s%s",OFS,values[ids[r],names[n]] 
     } 
     print "" 
    } 
} 

Output

/tmp$ awk -v FS=, -v OFS=, -f pivot_with_order.awk data.txt 
,Brand,Color,Gender,Logo,width,Fits,catgegory,primarycolor,size 
1,sports,White,Male,yes,10,,,, 
4,Running,,,,12,Lose,,, 
3,,blue,,,,,shoe,blue, 
5,Running,,,,,,,,M 

Thank you very much, Akshay; I will check this shortly. What if I have different values for the same field, like the following?

@Akshay Hegde, how can I change the script for input such as 5,Brand,Running and 5,Brand,Running1, so that the output is: 5,Running/Running1,,,,,,,,M

Sorry, I have modified the question to include the duplicate scenario.

Excel VBA:

Sub Pivot() 

    Dim rngData, rngOut, r, c, dR, dC 
    Set dR = CreateObject("scripting.dictionary") 
    Set dC = CreateObject("scripting.dictionary") 

    Set rngData = ActiveSheet.Range("A2:C2") '<< first row of input 
    Set rngOut = ActiveSheet.Range("G2")  '<< where to put output 

    Do While Application.CountA(rngData) > 0 
     r = rngData(1) 
     c = rngData(2) 
     If Not dR.exists(r) Then 
      dR.Add r, dR.Count + 1 
      rngOut.Offset(dR.Count, 0) = r 
     End If 
     If Not dC.exists(c) Then 
      dC.Add c, dC.Count + 1 
      rngOut.Offset(0, dC.Count) = c 
     End If 

     With rngOut.Offset(dR(r), dC(c)) 
      'if already has a value, add a newline separator 
      .Value = .Value & IIf(.Value <> "", vbLf, "") & rngData(3) 
     End With 
     Set rngData = rngData.Offset(1, 0) 
    Loop 

End Sub 

Python 2.7

Assuming list.csv contains:

1,Brand,sports 
1,Color,White 
1,Gender,Male 
1,Logo,yes 
1,width,10 
4,Brand,Running 
4,width,12 
4,Fits,Lose 
3,catgegory,shoe 
3,Color,blue 
3,Color,white 
3,primarycolor,blue 
5,size,M 
5,Brand,Sports 
5,Brand,Running 

Python code in un.py

# un.py 
# Pivot (id, name, value) rows; duplicate values are joined with "/" 
import csv 
rowid={}; colid={}; tab={} 
with open('list.csv', 'rb') as csvfile: 
    v = csv.reader(csvfile) 
    for row in v: 
        rowid[row[0]]=1; colid[row[1]]=1; k=row[0]+"@"+row[1] 
        if k in tab: 
            tab[k]=tab[k]+"/"+row[2] 
        else: 
            tab[k]=row[2] 

# row and column titles are sorted, so input order is not preserved 
rowTitle=sorted(rowid); colTitle=sorted(colid) 

s="" 
for j in colTitle: s=s+","+j 
print s 
for i in rowTitle: 
    s=i 
    for j in colTitle: 
        k=i+"@"+j 
        s=s+"," 
        if k in tab: 
            s=s+tab[k] 
    print s 

The output of py un.py is:

,Brand,Color,Fits,Gender,Logo,catgegory,primarycolor,size,width 
1,sports,White,,Male,yes,,,,10 
3,,blue/white,,,,shoe,blue,, 
4,Running,,Lose,,,,,,12 
5,Sports/Running,,,,,,,M,
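
The script above sorts both the IDs and the column names, so its rows and columns come out in a different order from the one asked for in the question. As a rough Python 3 sketch (not part of the original answer), the same idea can keep first-seen order by relying on dictionaries preserving insertion order (Python 3.7+). The names pivot and SAMPLE are made up for this illustration, and the "/" separator for duplicates is taken from the expected output in the question.

```python
import csv

# Sample rows copied from the question (id, name, value).
SAMPLE = """\
1,Brand,sports
1,Color,White
1,Gender,Male
1,Logo,yes
1,width,10
4,Brand,Running
4,width,12
4,Fits,Lose
3,catgegory,shoe
3,Color,blue
3,Color,white
3,primarycolor,blue
5,size,M
5,Brand,Sports
5,Brand,Running
"""


def pivot(lines, joiner="/"):
    """Return pivoted CSV lines: one row per id, one column per name.

    Ids and names keep the order in which they first appear in the input.
    Repeated (id, name) pairs have their values joined with `joiner`.
    """
    ids = {}      # dicts used as ordered sets (insertion order is kept)
    names = {}
    cells = {}    # (id, name) -> list of values, in input order
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        rid, name, value = (field.strip() for field in row[:3])
        ids[rid] = None
        names[name] = None
        cells.setdefault((rid, name), []).append(value)

    # Header starts with an empty cell for the id column.
    out = ["," + ",".join(names)]
    for rid in ids:
        fields = [joiner.join(cells.get((rid, name), [])) for name in names]
        out.append(",".join([rid] + fields))
    return out


if __name__ == "__main__":
    print("\n".join(pivot(SAMPLE.splitlines())))
```

To read from a file instead of the embedded sample, something like pivot(open("list.csv", newline="")) should work, since csv.reader accepts any iterable of lines. The values are joined with plain string concatenation, so a value that itself contains a comma would need csv.writer for correct quoting.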