Selectively deleting the whitespace after certain strings in Python

I have the following code snippet:
import pandas as pd
df = pd.DataFrame([{'LastName':'VAN HOUTEN'},
                   {'LastName':"O'BOYLE"},
                   {'LastName':'ESTEVAN-GONZALEZ'},
                   {'LastName':'RODRIGO TEIXEIRA'},
                   {'LastName':'ESTEVAN GONZALEZ'},
                   {'LastName':'O ROURKE'},
                   {'LastName':'RODRIGO-TEIXEIRA'}])
delete_space_after_list = ['VAN','O']
df['NewName'] = df['LastName'].str.replace("'"," ")
for s in delete_space_after_list[:]:
    df['NewName'] = df['NewName'].str.replace(s + ' ', s)
df['NewName'] = df['NewName'].str.replace('-'," ")
df['NewName'] = df['NewName'].str.split().str.get(0)
Running this code gives me the following result:
Index  LastName          NewName
0      VAN HOUTEN        VANHOUTEN
1      O'BOYLE           OBOYLE
2      ESTEVAN-GONZALEZ  ESTEVAN
3      RODRIGO TEIXEIRA  RODRIGOTEIXEIRA
4      ESTEVAN GONZALEZ  ESTEVANGONZALEZ
5      O ROURKE          OROURKE
6      RODRIGO-TEIXEIRA  RODRIGO
But the desired output is this:
Index  LastName          DesiredName
0      VAN HOUTEN        VANHOUTEN
1      O'BOYLE           OBOYLE
2      ESTEVAN-GONZALEZ  ESTEVAN
3      RODRIGO TEIXEIRA  RODRIGO
4      ESTEVAN GONZALEZ  ESTEVAN
5      O ROURKE          OROURKE
6      RODRIGO-TEIXEIRA  RODRIGO
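The code removes the space after RODRIGO (because of the 'O' at the end of RODRIGO) and concatenates it with 'TEIXEIRA'. Likewise, it removes the space after ESTEVAN (because of the 'VAN' at the end of ESTEVAN) and concatenates it with 'GONZALEZ'. It does, however, remove the space correctly in the other names.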
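
To make the cause concrete, here is a minimal sketch in plain Python (no pandas) of what the replacement loop appears to do to a single name. The helper name `naive_join` is made up for illustration; it only mirrors the loop over `delete_space_after_list`.

```python
# Hypothetical helper mirroring the loop in the question: a plain substring
# replacement matches 'VAN ' or 'O ' anywhere, including at the end of a
# longer word such as ESTEVAN or RODRIGO.
delete_space_after_list = ['VAN', 'O']

def naive_join(name):
    for s in delete_space_after_list:
        name = name.replace(s + ' ', s)
    return name

print(naive_join('O ROURKE'))          # OROURKE (intended)
print(naive_join('RODRIGO TEIXEIRA'))  # RODRIGOTEIXEIRA (not intended)
print(naive_join('ESTEVAN GONZALEZ'))  # ESTEVANGONZALEZ (not intended)
```

The replacement has no notion of where a word starts, so any name that merely ends in one of the listed strings is affected as well.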
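How can I get this code to remove the space correctly, as it already does for VAN HOUTEN, O'BOYLE, ESTEVAN-GONZALEZ, O ROURKE, and RODRIGO-TEIXEIRA, while not removing the space after ESTEVAN in ESTEVAN GONZALEZ or after RODRIGO in RODRIGO TEIXEIRA?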
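
One possible direction, offered only as a sketch and not as a definitive answer, is to match each listed string as a whole word using a regular-expression word boundary (`\b`) from the standard `re` module. The names `prefix_pattern` and `desired_name` are assumptions introduced for illustration. The order of steps follows the code in the question: apostrophe, then prefix join, then hyphen.

```python
import re

delete_space_after_list = ['VAN', 'O']

# \b requires the prefix to start at a word boundary, so the 'VAN' ending
# 'ESTEVAN' and the 'O' ending 'RODRIGO' are not matched.
prefix_pattern = re.compile(
    r'\b(' + '|'.join(re.escape(s) for s in delete_space_after_list) + r')\s+'
)

def desired_name(last_name):
    name = last_name.replace("'", ' ')      # O'BOYLE -> O BOYLE
    name = prefix_pattern.sub(r'\1', name)  # join whole-word prefixes only
    name = name.replace('-', ' ')           # split hyphenated names
    parts = name.split()
    return parts[0] if parts else ''        # keep the first token

for n in ['VAN HOUTEN', "O'BOYLE", 'ESTEVAN-GONZALEZ', 'RODRIGO TEIXEIRA',
          'ESTEVAN GONZALEZ', 'O ROURKE', 'RODRIGO-TEIXEIRA']:
    print(n, '->', desired_name(n))
```

With the DataFrame from the question, this could presumably be applied row by row with `df['NewName'] = df['LastName'].map(desired_name)`. A vectorized `str.replace` with the same pattern may also work, depending on the pandas version and its `regex` default.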