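Pandas: map new values based on the first character

Is there a way to map new values onto a DataFrame column based on the first character of the current value?

My current code: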

ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('1'), 'city', ncesvars['urbantype']) 
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('2'), 'suburban', ncesvars['urbantype']) 
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('3'), 'town', ncesvars['urbantype']) 
ncesvars['urbantype'] = np.where(ncesvars['urbantype'].str.startswith('4'), 'rural', ncesvars['urbantype'])

I would like to use some kind of dict and then pd.replace, but I'm not sure how to do that together with .str.startswith().

Source

2016-05-03 As3adTintin

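You can define a dict of your categories, slice the data with str[0:1], and call map on that Series. Use a boolean mask that tests whether the first character of each value is one of your dict keys, so that only the matching rows are overwritten. Without the mask you would overwrite non-matching rows with NaN, because (as in the example below) there is no mapping for the last row: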

In [16]: 
df = pd.DataFrame({'urbantype':['1 asdas','2 asd','3 asds','4 asdssd','5 asdas']}) 
df 

Out[16]:
  urbantype
0   1 asdas
1     2 asd
2    3 asds
3  4 asdssd
4   5 asdas

In [18]: 
d = {'1':'city','2':'suburban', '3': 'town','4':'rural'} 
df.loc[df['urbantype'].str[0:1].isin(d.keys()), 'urbantype'] = df['urbantype'].str[0:1].map(d) 
df 

Out[18]:
  urbantype
0      city
1  suburban
2      town
3     rural
4   5 asdas

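Source

2016-05-03 18:03:33 EdChum

Thanks for your input. Is the 'df.loc' part important, compared to @ayhan's answer? – As3adTintin

Yes, because you only want to affect the rows where the data matches one of your dict keys. Otherwise you overwrite the row with 'NaN', which is why the last row is left unchanged. – EdChum

Ahhh ok, thanks! – As3adTintin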
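To illustrate the point about 'df.loc' made in the comments above, here is a minimal sketch that reuses the example data from the answer. It shows what map returns without the mask, plus one possible alternative (an assumption, not part of the original answer) that falls back to the original value with fillna instead of masking:

```python
import pandas as pd

df = pd.DataFrame({'urbantype': ['1 asdas', '2 asd', '3 asds', '4 asdssd', '5 asdas']})
d = {'1': 'city', '2': 'suburban', '3': 'town', '4': 'rural'}

# First character of every value, as a Series of one-character strings.
first_char = df['urbantype'].str[0:1]

# Without a mask: any first character that is not a key of d maps to NaN.
unmasked = first_char.map(d)

# Alternative to the df.loc mask (assumption): keep the original value
# wherever map produced NaN.
filled = first_char.map(d).fillna(df['urbantype'])

print(unmasked.tolist())
print(filled.tolist())
```

Both the df.loc mask and the fillna fallback leave the '5 asdas' row untouched; which one reads better is largely a matter of taste.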

Try something like:

ncesvars['urbantype'] = ncesvars['urbantype'].replace({
    r'^1.*': 'city',
    r'^2.*': 'suburban'},
    regex=True)

Test:

In [32]: w
Out[32]:
     word
0    1_A_
1  word03
2  word02
3  word00
4    2xxx
5  word04
6  word01
7  word02
8  word04
9    3aaa

In [33]: w['word'].replace({r'^1.*': 'city', r'^2.*': 'suburban', r'^3.*': 'town'}, regex=True) 
Out[33]:
0        city
1      word03
2      word02
3      word00
4    suburban
5      word04
6      word01
7      word02
8      word04
9        town
Name: word, dtype: object

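Source

2016-05-03 17:57:35 MaxU

Thanks for your input. I get the error "replace() got an unexpected keyword argument 'regex'", and when I try it without the 'regex' argument I get the error "replace() takes at least 3 arguments (2 given)" – As3adTintin

It doesn't work for me, I get the original values back – As3adTintin

@As3adTintin, I've added a test case – MaxU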
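Since the question asked for a dict-driven approach, the regex replace above can also be built from a plain first-character dict. This is only a sketch: the sample data and the names d and patterns are made up for illustration, and it assumes a pandas version where Series.replace accepts regex=True.

```python
import re
import pandas as pd

# Hypothetical sample data, similar in spirit to the test case above.
w = pd.DataFrame({'word': ['1_A_', 'word03', '2xxx', '3aaa', '4 zz']})

# Plain mapping from first character to the new value.
d = {'1': 'city', '2': 'suburban', '3': 'town', '4': 'rural'}

# Turn each key into an anchored pattern that matches the whole string,
# e.g. '1' becomes r'^1.*'. re.escape guards against special characters.
patterns = {r'^' + re.escape(k) + r'.*': v for k, v in d.items()}

# Values that match no pattern are left unchanged.
result = w['word'].replace(patterns, regex=True)
print(result.tolist())
```

Note that this calls replace on the Series itself, not on its .str accessor; the two methods take different arguments.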
