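Modifying pandas code for a PySpark DataFrame

I have the code snippet below, which is used to create a plot. I want to modify it to work in PySpark, but I don't know how to proceed. The problem is that I can't iterate over a Column in PySpark, and I have had no success trying to turn this into a function.

Background: the DataFrame has a column named City, which is just the name of the city as a string.
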
cities = [i.City for i in df.select('City').distinct().collect()]
stack = []
for city in cities:
    df = sqlContext.sql('SELECT `Complaint Type`, COUNT(*) as `counts` '
                        'FROM c311 '
                        'WHERE City = "{}" COLLATE NOCASE '
                        'GROUP BY `Complaint Type` '
                        'ORDER BY counts DESC'.format(city))
    stack.append(Bar(x=df['Complaint Type'], y=df.counts, name=city.capitalize()))

My goal is to then send this to toPandas() and plot it locally. However, I get errors since "Column is not iterable". How do I solve this for PySpark?
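
One possible direction, offered as a sketch rather than a confirmed fix: the error most likely comes from passing df['Complaint Type'] and df.counts to Bar, because on a Spark DataFrame those are lazy Column expressions and not lists of values. The sketch below aggregates once on the cluster, brings the small result back to the driver, and only then builds one trace per city. The function names fetch_counts and build_traces are hypothetical, lower-casing City is an assumed stand-in for the case-insensitive match, and Bar is assumed to be plotly's Bar, which accepts x, y and name keyword arguments.

```python
from collections import defaultdict


def fetch_counts(spark_df):
    """Aggregate on the cluster once; return (city, complaint_type, count) tuples."""
    # Imported lazily so build_traces below can be used without Spark installed.
    from pyspark.sql import functions as F

    counts = (
        spark_df
        .where(F.col("City").isNotNull())
        # Assumption: lower-casing City replaces the case-insensitive comparison.
        .groupBy(F.lower(F.col("City")).alias("City"), F.col("Complaint Type"))
        .count()
    )
    # collect() materialises Row objects on the driver, so the rest is plain Python.
    return [(r["City"], r["Complaint Type"], r["count"]) for r in counts.collect()]


def build_traces(rows):
    """Group (city, complaint_type, count) tuples into one trace dict per city."""
    by_city = defaultdict(list)
    for city, complaint_type, count in rows:
        by_city[city].append((complaint_type, count))

    traces = []
    for city in sorted(by_city):
        # Largest counts first, mirroring ORDER BY counts DESC in the query.
        pairs = sorted(by_city[city], key=lambda p: p[1], reverse=True)
        traces.append({
            "x": [p[0] for p in pairs],
            "y": [p[1] for p in pairs],
            "name": city.capitalize(),
        })
    return traces
```

With these two helpers, the loop in the question could become stack = [Bar(**t) for t in build_traces(fetch_counts(df))], where df is the Spark DataFrame holding the City and Complaint Type columns. Calling toPandas() on the aggregated DataFrame instead of collect() should work just as well, since the values are local either way.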