2013-08-28 100 views
0

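Pad CSV values with 0

I have this CSV file, and I want to sort it on the 20th and 21st fields. For example, the data in those fields is P1 and PK5. My problem is that when I sort on those fields, the rows do not come out in the order I want. It seems I have to pad each of those fields to the length of the longest value that occurs in it.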

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate 
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK5","2031100094470495539729170204309","3GH000503","August 26, 2013" 
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK1","3031100094470495580529020291210","3GH000503","August 26, 2013" 
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK2","3031100094470495583729061944757","3GH000503","August 26, 2013" 
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013" 
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013" 
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013" 
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013" 

So, from the data above, I need the file to look like this:

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate 
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK05","2031100094470495539729170204309","3GH000503","August 26, 2013" 
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK01","3031100094470495580529020291210","3GH000503","August 26, 2013" 
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK02","3031100094470495583729061944757","3GH000503","August 26, 2013" 
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013" 
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013" 
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013" 
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013" 

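The P1 field could be P100, in which case I would need to pad P1 to P001. Really, it just needs to be padded to whatever the maximum length is. I can sort the file on the two fields, but I don't know how to pad them.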
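To illustrate why the padding matters, here is a minimal Python sketch. It assumes each value is a letter prefix followed by digits, as in the sample data; `pad_to_longest` is a hypothetical helper name, not something from the thread. A plain string sort puts PK5 after PK38, while the zero-padded values sort in numeric order.

```python
def pad_to_longest(values):
    """Zero-pad the digit suffix of each value to the longest value's length."""
    width = max(len(v) for v in values)
    padded = []
    for v in values:
        prefix = v.rstrip("0123456789")  # leading letters, e.g. "PK"
        digits = v[len(prefix):]         # trailing digits, e.g. "5"
        padded.append(prefix + digits.zfill(width - len(prefix)))
    return padded


codes = ["PK5", "PK1", "PK2", "PK31", "PK33", "PK24", "PK38"]

# Plain string comparison: "PK5" lands last because "5" > "3".
print(sorted(codes))

# After padding, string order matches numeric order.
print(sorted(pad_to_longest(codes)))

# The P1 / P100 case from the question.
print(pad_to_longest(["P1", "P100"]))
```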

Thanks in advance for your help.

+2

What environment are you in? Do you want to modify the CSV file itself? With Python, Perl, or something similar? We need more information! – simon

+0

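To answer your question, we need to know which programming language or tool you use to access the .csv. Knowing the database type (Oracle, MSSQL, MySQL, etc.) would also help. Q: You are trying to read an existing CSV (rather than write or modify a .csv), correct? – paulsm4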

+0

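Sorry, I am on a Linux system (SUSE). I do want to modify the CSV file with a shell script. I am trying to pad those two fields to the length of the longest value in each of them. – GroveTuckey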

Answers

1

OK, since nothing else has been forthcoming, here is a quick Python (2.x or 3.x) script that will do what you need:

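import csv
import sys

# Zero-based indexes of the 20th and 21st fields (PalletNum, PckgNum).
PAD_COLUMNS = (19, 20)

reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_ALL)

rows = list(reader)
header, data = rows[0], rows[1:]

for col in PAD_COLUMNS:
    # Pad every value to the longest value found in this column.
    max_len = max(len(row[col]) for row in data)
    for row in data:
        # Assumes a letter prefix followed by digits, e.g. "PK5".
        prefix = row[col].rstrip('0123456789')
        digits = row[col][len(prefix):]
        row[col] = prefix + digits.zfill(max_len - len(prefix))

writer.writerow(header)
writer.writerows(data)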

If you save this as, say, pad.py, then you can use it like this:

$ cat /path/to/my_csv_file.csv | python /path/to/pad.py > /path/to/my_new_csv_file.csv 

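and this will create my_new_csv_file.csv in the format you need. Because the script reads from stdin and writes to stdout, you can use it in a number of different ways to suit your purposes.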
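Since the goal in the question was to sort on the 20th and 21st fields, the padded output can then be ordered with an ordinary string comparison. The following is only a sketch under stated assumptions: `sort_padded_csv` is a hypothetical helper, the input is assumed to be already padded, and the sample rows are made up (19 filler columns followed by the two key columns).

```python
import csv
import io

PALLET, PCKG = 19, 20  # zero-based indexes of the 20th and 21st fields


def sort_padded_csv(text):
    """Return CSV text with data rows ordered by PalletNum, then PckgNum."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # Plain string comparison is enough once the values are zero-padded.
    data.sort(key=lambda row: (row[PALLET], row[PCKG]))
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data)
    return out.getvalue()


# Made-up sample: a header line plus three already-padded rows.
filler = ",".join('"x"' for _ in range(19))
sample = "\n".join([
    ",".join("c%d" % i for i in range(21)),
    filler + ',"P1","PK31"',
    filler + ',"P1","PK05"',
    filler + ',"P1","PK01"',
]) + "\n"

sorted_rows = list(csv.reader(io.StringIO(sort_padded_csv(sample))))[1:]
for row in sorted_rows:
    print(row[PALLET], row[PCKG])
```

On a SUSE shell the same ordering could also be produced by piping the padded file through another tool; the Python version is shown here only to stay in the language of the script above.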

Hope this helps.