2016-09-28 45 views

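pyspark error when creating a DF from an RDD: TypeError: Can not infer schema for type: <type 'float'>

I am using the following code to convert my RDD to a DataFrame: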

time_df = time_rdd.toDF(['my_time']) 

and I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-40-ab9e3025f679> in <module>() 
----> 1 time_df = time_rdd.toDF(['my_time']) 

/usr/local/spark-latest/python/pyspark/sql/session.py in toDF(self, schema, sampleRatio) 
    55   [Row(name=u'Alice', age=1)] 
    56   """ 
---> 57   return sparkSession.createDataFrame(self, schema, sampleRatio) 
    58 
    59  RDD.toDF = toDF 

/usr/local/spark-latest/python/pyspark/sql/session.py in createDataFrame(self, data, schema, samplingRatio) 
    518 
    519   if isinstance(data, RDD): 
--> 520    rdd, schema = self._createFromRDD(data.map(prepare), schema, samplingRatio) 
    521   else: 
    522    rdd, schema = self._createFromLocal(map(prepare, data), schema) 

/usr/local/spark-latest/python/pyspark/sql/session.py in _createFromRDD(self, rdd, schema, samplingRatio) 
    358   """ 
    359   if schema is None or isinstance(schema, (list, tuple)): 
--> 360    struct = self._inferSchema(rdd, samplingRatio) 
    361    converter = _create_converter(struct) 
    362    rdd = rdd.map(converter) 

/usr/local/spark-latest/python/pyspark/sql/session.py in _inferSchema(self, rdd, samplingRatio) 
    338 
    339   if samplingRatio is None: 
--> 340    schema = _infer_schema(first) 
    341    if _has_nulltype(schema): 
    342     for row in rdd.take(100)[1:]: 

/usr/local/spark-latest/python/pyspark/sql/types.py in _infer_schema(row) 
    987 
    988  else: 
--> 989   raise TypeError("Can not infer schema for type: %s" % type(row)) 
    990 
    991  fields = [StructField(k, _infer_type(v), True) for k, v in items] 

TypeError: Can not infer schema for type: <type 'float'> 

Does anyone know what I am missing? Thanks!

Answers


Check whether your time_rdd is actually an RDD.

What do you get from:

>>> type(time_rdd)

>>> dir(time_rdd)
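
The last line of the traceback also points at a likely cause: `_infer_schema` was handed a bare `float`, which suggests each element of time_rdd is a single number rather than a row-like object (tuple, list, dict or Row). If that is the case, wrapping every element in a one-element tuple before calling toDF should be enough. The sketch below is a minimal illustration under that assumption; `as_row` and `floats_to_df` are hypothetical helper names, not part of pyspark, and an active SparkSession is assumed.

```python
def as_row(value):
    """Wrap a bare scalar in a one-element tuple so it looks like a row."""
    # Tuples, lists and dicts are passed through unchanged, since schema
    # inference already accepts them (Row is a tuple subclass).
    if isinstance(value, (tuple, list, dict)):
        return value
    return (value,)


def floats_to_df(rdd, column="my_time"):
    """Hypothetical helper: turn an RDD of scalars into a one-column DataFrame.

    Assumes a SparkSession is already active, because creating the session
    is what attaches toDF to RDD objects.
    """
    return rdd.map(as_row).toDF([column])


# Intended use with the names from the question (needs a live Spark session):
#   time_df = floats_to_df(time_rdd)
# which amounts to:
#   time_df = time_rdd.map(lambda x: (x,)).toDF(['my_time'])
```

Running `time_rdd.first()` beforehand is a quick way to confirm whether the elements really are plain floats.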