0
joindf.printSchema()
root
|-- order_customer_id: string (nullable = true)
|-- order_date: string (nullable = true)
|-- order_id: string (nullable = true)
|-- order_status: string (nullable = true)
|-- order_item_id: string (nullable = true)
|-- order_item_order_id: string (nullable = true)
|-- order_item_product_id: string (nullable = true)
|-- order_item_product_price: string (nullable = true)
|-- order_item_quantity: string (nullable = true)
|-- order_item_subtotal: string (nullable = true)
joindf.show(5)
+-----------------+--------------------+--------+------------+-------------+-------------------+---------------------+------------------------+-------------------+-------------------+
|order_customer_id| order_date|order_id|order_status|order_item_id|order_item_order_id|order_item_product_id|order_item_product_price|order_item_quantity|order_item_subtotal|
+-----------------+--------------------+--------+------------+-------------+-------------------+---------------------+------------------------+-------------------+-------------------+
| 10153|2013-08-17 00:00:...| 4061| COMPLETE| 10153| 4080| 365| 59.99| 4| 239.96|
| 10153|2014-01-12 00:00:...| 27596| PENDING| 10153| 4080| 365| 59.99| 4| 239.96|
| 10153|2014-07-18 00:00:...| 56604| CLOSED| 10153| 4080| 365| 59.99| 4| 239.96|
| 10153|2013-08-14 00:00:...| 58259| COMPLETE| 10153| 4080| 365| 59.99| 4| 239.96|
| 10153|2013-08-14 00:00:...| 58269| PENDING| 10153| 4080| 365| 59.99| 4| 239.96|
+-----------------+--------------------+--------+------------+-------------+-------------------+---------------------+------------------------+-------------------+-------------------+
我在此RDD上使用combineByKey()生成結果,該結果給出了每天每個狀態的總訂單和總金額。 下面是代碼:使用combineByKey時的錯誤
joindf.map(lambda x: ((str(x[1]),str(x[3])),(float(x[9]),int(x[2]))))
.combineByKey(lambda v: (v[0],set(v[1])) ,
lambda acc,v: (acc[0]+v[0],v[1].add(acc[1])),
lambda acc1,acc2 : (acc1[0]+acc2[0],acc1[1].update(acc2[1])))
這是給錯誤。
TypeError: 'int' object is not iterable
我哪裏錯了?請幫助。
這是否幫助了? –