How to drop columns from a DataFrame

df2000.drop('jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec').show()


This shows the DataFrame without the dropped columns.


df2000.show()

But when I run the show command on its own to check the table, the columns I dropped are still there.

Source

2017-07-22 NandaKrishnan

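drop is not a side-effecting function. It returns a new DataFrame with the specified columns removed and leaves the original untouched. So you should assign the new DataFrame to a name you can refer to later, as shown below.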

>>> from functools import reduce  # a builtin on Python 2, must be imported on Python 3
>>> df2000 = spark.createDataFrame([('a',10,20,30),('a',10,20,30),('a',10,20,30),('a',10,20,30)],['key', 'jan', 'feb', 'mar'])
>>> df2000.show()
+---+---+---+---+
|key|jan|feb|mar|
+---+---+---+---+
|  a| 10| 20| 30|
|  a| 10| 20| 30|
|  a| 10| 20| 30|
|  a| 10| 20| 30|
+---+---+---+---+

>>> cols = ['jan', 'feb', 'mar']
>>> # each drop() returns a new DataFrame; reduce chains one drop per column
>>> df2000_dropped_col = reduce(lambda x, y: x.drop(y), cols, df2000)
>>> df2000_dropped_col.show()
+---+
|key|
+---+
|  a|
|  a|
|  a|
|  a|
+---+

Calling show on the new DataFrame now gives the expected result, with all the month columns dropped.
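
The same idea can be sketched without reduce, assuming Spark 2.0 or later, where drop accepts several column names in one call. The names drop_columns and demo are hypothetical helpers made up for this illustration, and the local[1] master and the app name are assumptions as well. The point is only that the result of drop has to be bound to a name, since the DataFrame it was called on does not change.

```python
def drop_columns(df, cols):
    """Return a new DataFrame without `cols`; `df` itself is left unchanged."""
    # Assumes Spark 2.0+, where DataFrame.drop takes several column
    # names as separate string arguments, so the list is unpacked with *.
    return df.drop(*cols)


def demo():
    # Hypothetical demo; imported here so drop_columns can be read on its own.
    from pyspark.sql import SparkSession

    # Local single-threaded session, an assumption made for this sketch.
    spark = SparkSession.builder.master("local[1]").appName("drop-demo").getOrCreate()
    df2000 = spark.createDataFrame(
        [("a", 10, 20, 30), ("a", 10, 20, 30)],
        ["key", "jan", "feb", "mar"],
    )

    # Rebind the name to the returned DataFrame; calling drop without
    # keeping the result would leave df2000 with all four columns.
    df2000 = drop_columns(df2000, ["jan", "feb", "mar"])
    df2000.show()
    print(df2000.columns)

    spark.stop()
```

Calling demo() in an environment with PySpark installed should show only the key column. Note that drop silently ignores names that are not in the schema, so comparing the list against df2000.columns first is a quick sanity check.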

Source

2017-07-23 02:02:38

This doesn't work in pyspark either. – NandaKrishnan

Can you try this? df2000_dropped_cols = df2000.drop('jan', 'feb', 'mar') –

Yes, I tried this. It doesn't work. df2000_dropped_cols is getting assigned.. – NandaKrishnan
