2017-11-10 106 views

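Replacing values in a DataFrame column using a dictionary

I have a DataFrame column of strings. I want to replace specific words in those strings with values from another DataFrame, which maps each word to be replaced to its replacement. I am currently using iterrows(), which takes about 2 minutes for 25,000 rows. Is there a more efficient way to do this?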

import pandas as pd

syn = pd.ExcelFile("C:/Key-Value.xlsx")
df_syn = syn.parse("Keys")

# One full pass over df['col'] for every row of df_syn
for idx, row in df_syn.iterrows():
    df['col'] = df['col'].str.replace(r"\b" + row['synonym'] + r"\b", row['word'], regex=True)

Answer

IIUC

Setup

df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender'])) 
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great'])) 

print(df_syn, df, sep='\n\n') 

   synonym    word
0      hug    warm
1     kiss  tender

                     col
0           I want a hug
1  a kiss would be great

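# Wrap each synonym in word boundaries, then build {pattern: replacement}
mapping = df_syn.assign(
    synonym=df_syn.synonym.radd(r'\b').add(r'\b')
).set_index('synonym').word.to_dict()

# Apply every pattern to 'col' as a regex; returns a new DataFrame
df.replace({'col': mapping}, regex=True)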

                       col
0            I want a warm
1  a tender would be great
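
If the synonym list is long, another option is to combine all synonyms into a single pattern and look up each match in a plain dict, so the column is scanned once rather than once per synonym. The following is only a sketch under assumptions: the names lookup, pattern and result are illustrative and not from the question, the synonyms are treated as literal text (hence re.escape), and no timing was measured on the original data.

```python
import re
import pandas as pd

df_syn = pd.DataFrame(dict(synonym=['hug', 'kiss'], word=['warm', 'tender']))
df = pd.DataFrame(dict(col=['I want a hug', 'a kiss would be great']))

# Plain lookup table: synonym -> replacement word (illustrative name)
lookup = dict(zip(df_syn['synonym'], df_syn['word']))

# One alternation of all escaped synonyms, longest first so a longer
# synonym is tried before a shorter one that shares its prefix
keys = sorted(lookup, key=len, reverse=True)
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, keys)) + r')\b')

# The callable receives each match object and returns its replacement
result = df['col'].str.replace(pattern, lambda m: lookup[m.group(0)], regex=True)
print(result)
```

Note that \b only matches next to a word character, so a synonym that begins or ends with punctuation would need a different boundary test.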