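How to store nested MongoDB documents in pandas without duplication

I am reading data from MongoDB and storing it in a pandas DataFrame for further exploratory analysis and machine learning. My MongoDB documents look like this: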
{
    "user_id" : "user_9",
    "order_id" : "order_9",
    "meals" : 5,
    "order_area" : "London",
    "dish" : [
        {
            "dish_id" : "012",
            "dish_name" : "ABC",
            "dish_type" : "Non-Veg",
            "dish_price" : 135,
            "dish_quantity" : 2,
            "ratings" : 4,
            "reviews" : "blah blah blah",
            "coupon_type" : "Rs 20 off"
        },
        {
            "dish_id" : "013",
            "dish_name" : "XYZ",
            "dish_type" : "Non-Veg",
            "dish_price" : 125,
            "dish_quantity" : 3,
            "ratings" : 4,
            "reviews" : "blah blah blah",
            "coupon_type" : "Rs 20 off"
        }
    ]
}
Once I have the data in Python, I use json_normalize to load it into a DataFrame:
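from pandas import json_normalize  # pandas >= 1.0; older releases: from pandas.io.json import json_normalize
df = json_normalize(db.dataset2.find(), 'dish',
                    ['_id', 'user_id', 'order_id', 'order_time', 'meals', 'order_area'])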
This gives me the following in pandas, with the dish-related attributes split into one row per dish:
  coupon_type dish_id dish_name  dish_price  dish_quantity
0   Rs 20 off     012       ABC         135              2
1   Rs 20 off     013       XYZ         125              3

   ratings         reviews coupon_type user_id order_id  meals order_area
0        4  blah blah blah   Rs 20 off       9        9      5     London
1        4  blah blah blah   Rs 20 off       9        9      5     London
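The problem is that the order-level data (user_id, order_id, meals, _id and order_area) is duplicated on every dish row. Is there another way to store this data in a DataFrame without the duplication?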
I had never heard of the 'pandas' library before, so the title of this question sounded very interesting to me :)
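
For what it's worth, here is a minimal sketch of two layouts that might avoid repeating the order-level columns. It is not the only way to do this, and it rests on some assumptions: an in-memory list (`orders`, a made-up name) stands in for `db.dataset2.find()`, `pd.json_normalize` requires pandas 1.0 or newer, and `_id` and `order_time` are left out because the sample document in the question does not contain them. The names `orders_df`, `dishes_df`, `joined`, `flat` and `indexed` are illustrative only.

```python
import pandas as pd

# Sample order copied from the question. In real use this list would be
# replaced by the documents returned from MongoDB (assumption).
orders = [
    {
        "user_id": "user_9",
        "order_id": "order_9",
        "meals": 5,
        "order_area": "London",
        "dish": [
            {"dish_id": "012", "dish_name": "ABC", "dish_type": "Non-Veg",
             "dish_price": 135, "dish_quantity": 2, "ratings": 4,
             "reviews": "blah blah blah", "coupon_type": "Rs 20 off"},
            {"dish_id": "013", "dish_name": "XYZ", "dish_type": "Non-Veg",
             "dish_price": 125, "dish_quantity": 3, "ratings": 4,
             "reviews": "blah blah blah", "coupon_type": "Rs 20 off"},
        ],
    }
]

order_cols = ["user_id", "order_id", "meals", "order_area"]

# Option 1: two tables, one row per order and one row per dish.
# Only order_id is carried onto the dish rows, as a key linking the tables.
orders_df = pd.DataFrame([{k: doc[k] for k in order_cols} for doc in orders])
dishes_df = pd.json_normalize(orders, record_path="dish", meta=["order_id"])

# Join on demand, only when an analysis needs order fields beside dish rows.
joined = dishes_df.merge(orders_df, on="order_id", how="left")

# Option 2: a single table with the order-level fields moved into a MultiIndex.
flat = pd.json_normalize(orders, record_path="dish", meta=order_cols)
indexed = flat.set_index(order_cols + ["dish_id"])

print(orders_df)
print(dishes_df)
print(indexed)
```

With option 1 the order attributes are stored once per order, and the flat view is rebuilt with `merge` only when needed. With option 2 each dish row still belongs to its order, but the order fields become index levels, which pandas by default prints only once per group of consecutive repeats. For a feature matrix that needs one row per dish with the order fields attached, the repeated columns in the original `json_normalize` output may be acceptable as they are.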