Delete rows from a CSV file based on a column condition in bash

I have a very large CSV file (5 GB). The header is:
run number,export,downerQ,coefUpQuality,chooseMode,demandF,nbPLots,standarDevPop,nbCitys,whatWord,priceMaxWineF,marketColor,[step],giniIndexReserve,giniIndexPatch,meanQualityTotal,meanQualityMountain,meanQualityPlain,DiffExtCentral,nbcentralPlots,meanPatchByNetwork,sum_q_viti_moutain,sum_q_viti_plaine
"3","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.07083333333333335","0","0","0","0","0","0","48","0"
"4","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.04285714285714286","0","0","0","0","0","0","42","0"
"2","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.05348837209302328","0","0","0","0","0","0","43","0"
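I want to keep only the rows whose [step] field (the thirteenth field) contains exactly "500".
- I tried importing the CSV into SQLite ... but the delete crashes...
- R crashes too (even with fread from data.table)
Does anyone have a solution using a tool such as sed, awk, or some other command?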
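Take a look at [csvfix](https://code.google.com/p/csvfix/). It can certainly do this. In the shell, a first step could be `grep -E '^run number|,"500",'` to select the header line and the lines that contain 500 somewhere; you could then use `awk` to narrow that down to 500 in column 13. Or you could do the whole job in awk: `awk -F, 'NR == 1 || $13 == "\"500\"" { print }'` (untested; you might need to set `OFS` to `","`, but probably not). – 2015-01-20 20:44:28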
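For illustration, here is a minimal sketch of the awk-only approach from the comment above, run against a tiny made-up sample. The helper name `filter_step` and the sample rows are assumptions for the example, not part of the original question. It also assumes that no field contains an embedded comma, which holds for the rows shown in the question but is not guaranteed for CSV in general.

```shell
#!/bin/sh
# Sketch: keep the header plus the rows whose 13th field is exactly "500".
# Assumption: no field contains an embedded comma, so splitting on "," is safe.

# filter_step FILE VALUE (hypothetical helper name)
# Prints line 1 (the header) and every row whose 13th field equals "VALUE",
# double quotes included, because the data fields are quoted in the file.
filter_step() {
    awk -F, -v want="\"$2\"" 'NR == 1 || $13 == want' "$1"
}

# Made-up miniature input with the same 23-column layout as the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
run number,export,downerQ,coefUpQuality,chooseMode,demandF,nbPLots,standarDevPop,nbCitys,whatWord,priceMaxWineF,marketColor,[step],giniIndexReserve,giniIndexPatch,meanQualityTotal,meanQualityMountain,meanQualityPlain,DiffExtCentral,nbcentralPlots,meanPatchByNetwork,sum_q_viti_moutain,sum_q_viti_plaine
"3","false","0.5","0.01","false","7000","10","2","10","0","70","false","0","0","0.07","0","0","0","0","0","0","48","0"
"3","false","0.5","0.01","false","7000","10","2","10","0","70","false","500","0","0.07","0","0","0","0","0","0","48","0"
"4","false","0.5","0.01","false","7000","10","2","10","0","70","false","1500","0","0.04","0","0","0","0","0","0","42","0"
"4","false","0.5","0.01","false","7000","10","2","10","500","70","false","499","0","0.04","0","0","0","0","0","0","42","0"
EOF

# Only the header and the row with [step] == "500" should be printed.
filter_step "$sample" 500
```

The last two sample rows show why a plain `grep` is not enough on its own: one has `"1500"` in column 13, and the other has `"500"` in a different column. On the real file the output would simply be redirected, for example `filter_step big.csv 500 > filtered.csv` (file names hypothetical). Since awk reads the input one line at a time, memory use should not grow with the size of the file.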