My dataset has many columns whose values contain $ signs and commas, for example $150,000.50. I import the dataset like this:
datasets = pd.read_csv('salaries-by-college-type.csv')
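The imputer object then fails, because several columns hold string values such as $-prefixed amounts instead of numbers. How can I correct this in my Python program?
This is my dataset. Apart from School Type, all the remaining columns contain $ signs and commas. Is there a generic way to remove the $ and commas from the values in these columns?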
School Type 269 non-null object
Starting Median Salary 269 non-null float64
Mid-Career Median Salary 269 non-null float64
Mid-Career 10th Percentile Salary 231 non-null float64
Mid-Career 25th Percentile Salary 269 non-null float64
Mid-Career 75th Percentile Salary 269 non-null float64
Mid-Career 90th Percentile Salary 231 non-null float64
Here is a sample of my dataset:
School Type Starting Median Salary Mid-Career Median Salary Mid-Career 10th Percentile Salary Mid-Career 25th Percentile Salary Mid-Career 75th Percentile Salary Mid-Career 90th Percentile Salary
Engineering $72,200.00 $126,000.00 $76,800.00 $99,200.00 $168,000.00 $220,000.00
Engineering $75,500.00 $123,000.00 N/A $104,000.00 $161,000.00 N/A
Engineering $71,800.00 $122,000.00 N/A $96,000.00 $180,000.00 N/A
Engineering $62,400.00 $114,000.00 $66,800.00 $94,300.00 $143,000.00 $190,000.00
Engineering $62,200.00 $114,000.00 N/A $80,200.00 $142,000.00 N/A
Engineering $61,000.00 $114,000.00 $80,000.00 $91,200.00 $137,000.00 $180,000.00
'df.column = df.column.str.strip('$')' –
Thanks... what about the comma in 15,000.50? – Kda
'... strip(",")' – Fallenreaper
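One caveat on the suggestions above: str.strip only removes characters from the start and end of a string, so it would drop a leading $ but leave the comma inside 15,000.50 in place. A possible generic approach, sketched below, is to remove both characters with str.replace and then convert with pd.to_numeric. This is a minimal sketch, not a definitive fix: the inline sample and its layout (quoted currency strings, N/A for missing values) are assumptions standing in for the real salaries-by-college-type.csv, and only two of the salary columns are shown.

```python
import io
import pandas as pd

# Small inline sample standing in for salaries-by-college-type.csv.
# Assumed layout: quoted currency strings, "N/A" for missing values.
sample = io.StringIO(
    'School Type,Starting Median Salary,Mid-Career 10th Percentile Salary\n'
    'Engineering,"$72,200.00","$76,800.00"\n'
    'Engineering,"$75,500.00",N/A\n'
)
df = pd.read_csv(sample)  # "N/A" is read as NaN by default

# Every column except the categorical one holds currency strings.
money_cols = df.columns.drop('School Type')

for col in money_cols:
    # Remove every "$" and "," wherever it appears, not just at the ends.
    cleaned = df[col].str.replace(r'[$,]', '', regex=True).str.strip()
    # Convert to float; anything still unparseable becomes NaN.
    df[col] = pd.to_numeric(cleaned, errors='coerce')

print(df.dtypes)
```

After this the salary columns are float64, with NaN where the file had N/A, which is the form an imputer normally expects. Selecting the columns by dropping School Type keeps the loop generic, so no column name has to be listed by hand.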