Iterating through multiple DataFrames in pandas

I have two DataFrames: 1) a list of suppliers with their latitude/longitude coordinates

sup_essential = pd.DataFrame({'supplier': ['A','B','C'], 
           'coords': [(51.1235,-0.3453),(52.1245,-0.3423),(53.1235,-1.4553)]})

2) a list of stores with their latitude/longitude coordinates

stores_essential = pd.DataFrame({'storekey': [1,2,3], 
           'coords': [(54.1235,-0.6553),(49.1245,-1.3423),(50.1235,-1.8553)]})

I want to create an output table containing store, store_coordinates, supplier, supplier_coordinates and the distance for every combination of store and supplier.

I currently have:

test = []
for row in sup_essential.iterrows():
    for row in stores_essential.iterrows():
        r = sup_essential['supplier'], stores_essential['storekey']
        test.append(r)

But this just gives me repeated copies of all the values.

Source
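A note on why the loop repeats values: both loops reuse the name `row`, and the loop body never reads it. Instead it appends the entire `supplier` column and the entire store-key column on every pass. The following is a minimal sketch of a loop that pairs individual rows, assuming the two sample frames from the question; the `pairs` name and its column labels are illustrative only.

```python
import itertools
import pandas as pd

sup_essential = pd.DataFrame({
    'supplier': ['A', 'B', 'C'],
    'coords': [(51.1235, -0.3453), (52.1245, -0.3423), (53.1235, -1.4553)]})
stores_essential = pd.DataFrame({
    'storekey': [1, 2, 3],
    'coords': [(54.1235, -0.6553), (49.1245, -1.3423), (50.1235, -1.8553)]})

# itertuples yields one namedtuple per row; product pairs every
# supplier row with every store row (supplier varies slowest).
rows = [
    (st.storekey, st.coords, s.supplier, s.coords)
    for s, st in itertools.product(
        sup_essential.itertuples(index=False),
        stores_essential.itertuples(index=False))
]

pairs = pd.DataFrame(
    rows, columns=['storekey', 'store_coords', 'supplier', 'supplier_coords'])
```

This is fine for small frames. For large ones, a vectorised cross join avoids the Python-level loop.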

2017-04-16 PaddyD15

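Please provide a small (3-7 rows) reproducible data set in text/CSV format, along with the desired data set. Please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – MaxU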

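@MaxU The data itself is confidential, and since it consists of coordinates it would be very easy to identify. However, the headers are: for stores: storeKey (int) \t locationLongitude \t locationLatitude \t coords (latitude, longitude); for suppliers: supplier (varchar) \t latitude \t longitude \t coords (latitude, longitude) – PaddyD15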

You don't need to provide real data. Just [post](http://stackoverflow.com/posts/43435657/edit) a sample (fake) data set in your question – MaxU

Source DFs:

In [105]: sup
Out[105]:
               coords supplier
0  (51.1235, -0.3453)        A
1  (52.1245, -0.3423)        B
2  (53.1235, -1.4553)        C

In [106]: stores
Out[106]:
               coords  storekey
0  (54.1235, -0.6553)         1
1  (49.1245, -1.3423)         2
2  (50.1235, -1.8553)         3

Solution:

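import numpy as np
import pandas as pd
# Current scikit-learn exposes DistanceMetric in sklearn.metrics;
# older releases had it in sklearn.neighbors.
from sklearn.metrics import DistanceMetric

dist = DistanceMetric.get_metric('haversine')

# Cross join: pair every supplier with every store via a constant key.
m = pd.merge(sup.assign(x=0), stores.assign(x=0), on='x', suffixes=('1', '2')).drop(columns='x')

# Split the (lat, lon) tuples into separate numeric columns.
d1 = sup[['coords']].assign(lat=sup.coords.str[0], lon=sup.coords.str[1]).drop(columns='coords')
d2 = stores[['coords']].assign(lat=stores.coords.str[0], lon=stores.coords.str[1]).drop(columns='coords')

# Haversine expects radians and returns unit-sphere distances; scale by the Earth's radius in km.
m['dist_km'] = np.ravel(dist.pairwise(np.radians(d1), np.radians(d2)) * 6367)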
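In pandas 1.2 and later, the constant-key trick (the helper column `x=0`) can likely be replaced by `how='cross'` in `merge`. The following is a sketch under that version assumption, rebuilding the sample `sup` and `stores` frames so it runs on its own:

```python
import pandas as pd

sup = pd.DataFrame({
    'supplier': ['A', 'B', 'C'],
    'coords': [(51.1235, -0.3453), (52.1245, -0.3423), (53.1235, -1.4553)]})
stores = pd.DataFrame({
    'storekey': [1, 2, 3],
    'coords': [(54.1235, -0.6553), (49.1245, -1.3423), (50.1235, -1.8553)]})

# how='cross' builds the Cartesian product directly, so no helper
# column has to be added and dropped. The overlapping 'coords'
# column still receives the suffixes.
m = sup.merge(stores, how='cross', suffixes=('1', '2'))
```

The row order is the same as with the constant key (each supplier paired with every store in turn), so the raveled distance matrix from the answer still lines up with it.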

Result:

In [135]: m 
Out[135]: 
              coords1 supplier             coords2  storekey     dist_km
0  (51.1235, -0.3453)        A  (54.1235, -0.6553)         1  334.029670
1  (51.1235, -0.3453)        A  (49.1245, -1.3423)         2  233.213416
2  (51.1235, -0.3453)        A  (50.1235, -1.8553)         3  153.880680
3  (52.1245, -0.3423)        B  (54.1235, -0.6553)         1  223.116901
4  (52.1245, -0.3423)        B  (49.1245, -1.3423)         2  340.738587
5  (52.1245, -0.3423)        B  (50.1235, -1.8553)         3  246.116984
6  (53.1235, -1.4553)        C  (54.1235, -0.6553)         1  122.997130
7  (53.1235, -1.4553)        C  (49.1245, -1.3423)         2  444.459052
8  (53.1235, -1.4553)        C  (50.1235, -1.8553)         3  334.514028
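The `dist_km` values can be cross-checked without scikit-learn. Below is a sketch of the haversine formula in plain NumPy, using the same 6367 km Earth radius as the answer. The helper name `haversine_km` and the array names are assumptions for illustration, not part of the original answer.

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6367.0):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))

# Sample coordinates from the question, split into latitude/longitude.
sup_lat = np.array([51.1235, 52.1245, 53.1235])
sup_lon = np.array([-0.3453, -0.3423, -1.4553])
store_lat = np.array([54.1235, 49.1245, 50.1235])
store_lon = np.array([-0.6553, -1.3423, -1.8553])

# Broadcasting a column of suppliers against a row of stores gives a
# 3x3 matrix: rows are suppliers, columns are stores.
dist_matrix = haversine_km(sup_lat[:, None], sup_lon[:, None],
                           store_lat[None, :], store_lon[None, :])
```

Flattening `dist_matrix` row by row with `np.ravel` gives the distances in the same supplier-then-store order as the result table.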

Source

2017-04-16 10:17:22 MaxU
