How to get rid of the $ sign in column values in Python

My dataset has many columns whose values contain a $ sign and commas, for example $150,000.50. Once I import the dataset:

import pandas as pd
datasets = pd.read_csv('salaries-by-college-type.csv')

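the imputer object fails because the $ columns hold string values. How can I correct this in my Python program?

This is my dataset. Except for School Type, all the other columns contain $ signs and commas. Is there a generic way to remove the $ signs and commas from the values in these columns?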

School Type       269 non-null object 
Starting Median Salary    269 non-null float64 
Mid-Career Median Salary    269 non-null float64 
Mid-Career 10th Percentile Salary 231 non-null float64 
Mid-Career 25th Percentile Salary 269 non-null float64 
Mid-Career 75th Percentile Salary 269 non-null float64 
Mid-Career 90th Percentile Salary 231 non-null float64

Here is a sample of my dataset:

School Type  Starting Median Salary  Mid-Career Median Salary  Mid-Career 10th Percentile Salary  Mid-Career 25th Percentile Salary  Mid-Career 75th Percentile Salary  Mid-Career 90th Percentile Salary
Engineering  $72,200.00              $126,000.00               $76,800.00                         $99,200.00                         $168,000.00                        $220,000.00
Engineering  $75,500.00              $123,000.00               N/A                                $104,000.00                        $161,000.00                        N/A
Engineering  $71,800.00              $122,000.00               N/A                                $96,000.00                         $180,000.00                        N/A
Engineering  $62,400.00              $114,000.00               $66,800.00                         $94,300.00                         $143,000.00                        $190,000.00
Engineering  $62,200.00              $114,000.00               N/A                                $80,200.00                         $142,000.00                        N/A
Engineering  $61,000.00              $114,000.00               $80,000.00                         $91,200.00                         $137,000.00                        $180,000.00

2017-10-06 Kda

`df.column = df.column.str.strip('$')` –

Thanks... what about the comma in 15,000.50? – Kda

`... strip(",")` – Fallenreaper
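
One caveat on the two `strip` suggestions above: `str.strip` only removes characters from the ends of a string, so it will not touch a comma in the middle of a number. A minimal sketch of the difference, assuming a toy Series named `s` as a hypothetical stand-in for one salary column:

```python
import pandas as pd

# Hypothetical stand-in for one salary column.
s = pd.Series(['$15,000.50', '$72,200.00'])

# strip only trims leading/trailing characters: the '$' goes, the comma stays.
stripped = s.str.strip('$')

# replace with regex=False removes every literal comma, wherever it appears.
cleaned = stripped.str.replace(',', '', regex=False).astype(float)

print(stripped.tolist())
print(cleaned.tolist())
```

So the interior commas probably need `str.replace` rather than `strip`.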

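Suppose you have a csv that looks like this.
Note: I don't really know what your csv looks like. Make sure to adjust the read_csv arguments accordingly, most importantly the sep argument.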

h1|h2 
a|$1,000.99 
b|$500,000.00

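Use the converters argument of pd.read_csv to pass a dictionary whose keys are the names of the columns you want to convert and whose values are the functions that convert each value to a number.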

pd.read_csv(
    'salaries-by-college-type.csv', sep='|', 
    converters=dict(h2=lambda x: float(x.strip('$').replace(',', ''))) 
) 

  h1         h2
0  a    1000.99
1  b  500000.00
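
To apply the same idea to every salary column at once, one option is to build the converters dictionary from the column names. This is only a sketch: the inline CSV text, the comma separator, the quoting, and the helper name `money_to_float` are assumptions made for illustration, not the asker's actual file. A converter receives the raw string, so the N/A entries have to be handled by hand.

```python
import io
import pandas as pd

# Assumed layout: comma-separated with quoted money values (hypothetical sample).
csv_text = '''School Type,Starting Median Salary,Mid-Career 10th Percentile Salary
Engineering,"$72,200.00","$76,800.00"
Engineering,"$75,500.00",N/A
'''

def money_to_float(text):
    """Hypothetical helper: '$72,200.00' -> 72200.0, 'N/A' or '' -> NaN."""
    text = text.strip()
    if text in ('', 'N/A'):
        return float('nan')
    return float(text.replace('$', '').replace(',', ''))

# Read only the header row to learn the column names.
columns = pd.read_csv(io.StringIO(csv_text), nrows=0).columns
salary_columns = [c for c in columns if c != 'School Type']

df = pd.read_csv(
    io.StringIO(csv_text),
    converters={c: money_to_float for c in salary_columns},
)
print(df.dtypes)
```

With a real file, the `io.StringIO(csv_text)` arguments would be replaced by the file path and whatever `sep` the file actually uses.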

Or, assuming you have already imported the dataframe

df = pd.read_csv(
    'salaries-by-college-type.csv', sep='|' 
)

then use pd.Series.str.replace

df.h2 = df.h2.str.replace(r'[^\d.]', '', regex=True).astype(float)

df 

  h1         h2
0  a    1000.99
1  b  500000.00

Or pd.DataFrame.replace

df.replace(dict(h2=r'[^\d.]'), '', regex=True).astype(dict(h2=float))

  h1         h2
0  a    1000.99
1  b  500000.00
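
For the generic case asked about in the question (every column except School Type), the same regular-expression idea can be applied to several columns in one step. A minimal sketch, assuming an inline comma-separated sample with quoted values in place of the real file. It relies on read_csv treating N/A as missing by default, so those cells are already NaN before the cleanup.

```python
import io
import pandas as pd

# Assumed layout: comma-separated with quoted money values (hypothetical sample).
csv_text = '''School Type,Starting Median Salary,Mid-Career 10th Percentile Salary
Engineering,"$72,200.00","$76,800.00"
Engineering,"$75,500.00",N/A
'''
df = pd.read_csv(io.StringIO(csv_text))

# Every column except the one that really is text.
salary_columns = df.columns.drop('School Type')

# Remove '$' and ',' from the string cells, then convert; NaN cells pass through.
df[salary_columns] = (
    df[salary_columns]
    .replace(r'[$,]', '', regex=True)
    .astype(float)
)
print(df.dtypes)
```

Once the columns are numeric, the missing values remain as NaN, which is the form an imputer would normally expect.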

2017-10-06 00:38:01 piRSquared

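Here is my dataset. Except for the first column, all the others have $ and comma values. How can I get a generic fix? – Kda

School Type 269 non-null object Starting Median Salary 269 non-null float64 Mid-Career Median Salary 269 non-null float64 Mid-Career 10th Percentile Salary 231 non-null float64 Mid-Career 25th Percentile Salary 269 non-null float64 Mid-Career 75th Percentile Salary 269 non-null float64 Mid-Career 90th Percentile Salary 231 non-null float64 – Kda

@Kda You need to edit your question and paste the data there. – piRSquared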
