2017-09-29 22 views
-1
import pandas as pd

fake = {'EmployeeID': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        'State': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
        'Email': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']}
fake_df = pd.DataFrame(fake)

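I want to define a function that returns a Series of strings containing all the employee email addresses for each state. The email addresses should be separated by a given delimiter; I think I will use ";". How do I get a Series out of the DataFrame?

Parameters: - the DataFrame - the delimiter (;)

Do I have to use a loop? To be honest, I don't even know how to start on this.

==== Edit

Once the function is written, I should be able to run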

emails = getEmailListByState(fake_df, ", ")
for state in sorted(emails.index):
    print("%15s: %s" % (state, emails[state]))

and should get something like this:

a: a 
b: b 
c: c,d 
d: e 
e: f,g 

My output

+0

Have a look at DataFrame [indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html) and [joining](https://docs.python.org/2/library/stdtypes.html#str.join) an iterable of strings – bunji

+0

BTW, what output do you expect? – Wen

+0

I have edited my post –

Answer

1

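If I understand the question correctly, you are looking to group by State, select the Email column, and apply join, i.e. join the emails that share a state: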

fake = {'EmployeeID': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
        'State': ['NZ', 'NZ', 'NY', 'NY', 'ST', 'ST', 'YK', 'YK', 'YK', 'YK'],
        'Email': ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']}
fake_df = pd.DataFrame(fake)

ndf = fake_df.groupby('State')['Email'].apply(', '.join) 

Output:

 
State 
NY       [email protected], [email protected] 
NZ       [email protected], [email protected] 
ST       [email protected], [email protected] 
YK [email protected], [email protected], [email protected], [email protected] 
Name: Email, dtype: object 

If you want it in a function, then:

def getEmailListByState(df,delim): 
    return df.groupby('State')['Email'].apply(delim.join) 

emails = getEmailListByState(fake_df, ", ") 
for state in sorted(emails.index): 
    print("%15s: %s" % (state, emails[state]))
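
For a version that runs on its own, here is a minimal sketch of the same groupby-and-join idea using the ";" delimiter mentioned in the question. The sample addresses are made-up placeholders, and the function name `get_emails_by_state` and the `dropna` step are assumptions added for illustration, not part of the original post. The `dropna` call is there because `str.join` raises a `TypeError` when a group contains a missing value.

```python
import pandas as pd

# Hypothetical sample data: these addresses are made-up placeholders,
# not the ones used in the post above.
sample = {
    'EmployeeID': [0, 1, 2, 3, 4],
    'State': ['NY', 'NZ', 'NY', 'ST', 'NZ'],
    'Email': ['a@example.com', 'b@example.com', 'c@example.com',
              'd@example.com', None],
}
sample_df = pd.DataFrame(sample)


def get_emails_by_state(df, delim):
    """Return a Series indexed by State whose values are the joined emails."""
    # Assumption: rows without an email should be skipped, because
    # delim.join fails on None/NaN entries.
    present = df.dropna(subset=['Email'])
    # One group per state; each group's emails are joined into one string.
    return present.groupby('State')['Email'].apply(delim.join)


emails = get_emails_by_state(sample_df, ';')
for state in sorted(emails.index):
    print("%15s: %s" % (state, emails[state]))
```

No explicit loop over the rows is needed: `groupby` splits the Email column by state and `delim.join` is called once per group. The result is a Series, so `emails['NY']` looks up the joined string for a single state.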