2016-05-04

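Python: using multiprocessing with a dataframe to geocode

I want to add a column of zip codes to a data frame. The code below runs, but it does not create a new column in the data frame gps.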

import geocoder
import multiprocessing as mp
import pandas as pd

google_key = 'key'

def reverse_gecode(coordinates):
    # Reverse-geocode a (lat, lon) pair and return its postal code
    return geocoder.google(coordinates, key=google_key, method='reverse').postal

if __name__ == '__main__':
    gps = pd.DataFrame({'lat': [27.950575, 40.6936488],
                        'lon': [-82.4571776, -89.5889864]})
    # list() so this also works on Python 3, where zip() returns an iterator
    gps['gps'] = list(zip(gps.lat, gps.lon))
    x = list(gps['gps'])
    # multiprocessing: leave one core free, but always use at least one worker
    pool = mp.Pool(processes=max(1, mp.cpu_count() - 1))
    result_latlong = pool.map(reverse_gecode, x)
    pool.close()
    pool.join()

I have tried:

  1. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, list(x[2])), axis = 1)
  2. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, x[2]), axis = 1)
  3. gps['zip_code'] = gps.apply(lambda x: pool.map(reverse_gecode, [x[0], x[1]]), axis = 1)

But I can't get any of them to work. The error I keep getting is:

ValueError: ('Unknown location: 27.950575', u'occurred at index 0')
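
The message is consistent with how pool.map treats its second argument: it iterates over it and calls the function once per element. In the attempts above, that argument is a single row's (lat, lon) pair, so each worker would receive one bare number such as 27.950575 rather than a pair. The sketch below illustrates this offline. The function show_argument is a hypothetical stand-in for reverse_gecode, and multiprocessing.dummy.Pool (a thread-backed pool with the same map interface) is used only so the example runs without a network call.

```python
from multiprocessing.dummy import Pool  # thread-backed pool, same map() interface

def show_argument(coordinates):
    # Hypothetical stand-in for reverse_gecode: return exactly what the worker receives
    return coordinates

row = (27.950575, -82.4571776)  # one row's (lat, lon) pair

pool = Pool(2)
per_element = pool.map(show_argument, row)   # iterates over the pair itself
per_pair = pool.map(show_argument, [row])    # iterates over a list containing the pair
pool.close()
pool.join()

print(per_element)  # each call received a single float
print(per_pair)     # the call received the whole (lat, lon) tuple
```

Under that reading, passing a list of pairs to a single pool.map call, instead of calling pool.map once per row inside apply, gives each worker a complete coordinate.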

Answers

Try this:

import geocoder
import multiprocessing as mp
import pandas as pd

def reverse_gecode(coordinates):
    # Reverse-geocode a (lat, lon) pair and return its postal code
    return geocoder.google(coordinates, method='reverse').postal

if __name__ == '__main__':
    gps = pd.DataFrame({'lat': [27.950575, 40.6936488],
                        'lon': [-82.4571776, -89.5889864]})
    # Build one (lat, lon) tuple of strings per row
    coords = gps[['lat', 'lon']].astype(str).apply(lambda x: (x['lat'], x['lon']), axis=1).tolist()
    # multiprocessing: leave one core free, but always use at least one worker
    pool = mp.Pool(processes=max(1, mp.cpu_count() - 1))
    # pool.map returns one result per row, so it can be assigned as a column
    gps['zip_code'] = pool.map(reverse_gecode, coords)
    print(gps)
    pool.close()
    pool.join()

PS: I removed key=google_key from the geocoder.google() call, because it didn't work for me.

Output:

         lat        lon zip_code
0  27.950575 -82.457178    33602
1  40.693649 -89.588986    61603
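
Assigning the result of pool.map straight to gps['zip_code'] relies on pool.map returning results in the same order as its inputs, even when the workers finish out of order. A minimal sketch of that property follows. The function slow_lookup and its artificial delays are hypothetical, and multiprocessing.dummy.Pool (thread-backed, same map interface) is used so the example runs anywhere without a geocoding service.

```python
import time
from multiprocessing.dummy import Pool  # thread-backed pool, same map() interface

def slow_lookup(item):
    # Hypothetical worker: the first task sleeps longest, so it finishes last
    index, delay = item
    time.sleep(delay)
    return index

tasks = [(0, 0.2), (1, 0.1), (2, 0.0)]

pool = Pool(3)
ordered = pool.map(slow_lookup, tasks)  # results follow input order, not finish order
pool.close()
pool.join()

print(ordered)
```

Because the result list lines up with the input list, row i of the data frame receives the postal code computed for row i.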

@dustin, I have updated my answer - please check – MaxU

@dustin, have you tried running my code 'as is'? – MaxU

I've just checked it - it works on Python 3.5.1 and 2.7.11 (pandas: 0.18.0) - which versions do you have? – MaxU