我試圖篩選基於類似下面的RDD:pyspark:類型錯誤:條件應該是字符串或列
spark_df = sc.createDataFrame(pandas_df)
spark_df.filter(lambda r: str(r['target']).startswith('good'))
spark_df.take(5)
但得到了以下錯誤:
TypeErrorTraceback (most recent call last)
<ipython-input-8-86cfb363dd8b> in <module>()
1 spark_df = sc.createDataFrame(pandas_df)
----> 2 spark_df.filter(lambda r: str(r['target']).startswith('good'))
3 spark_df.take(5)
/usr/local/spark-latest/python/pyspark/sql/dataframe.py in filter(self, condition)
904 jdf = self._jdf.filter(condition._jc)
905 else:
--> 906 raise TypeError("condition should be string or Column")
907 return DataFrame(jdf, self.sql_ctx)
908
TypeError: condition should be string or Column
任何想法,我錯過了什麼?謝謝!
有一個完美的答案正下方位置; ) – javadba