You cannot use the euro sign in a variable name:
Identifiers (also referred to as names) are described by the following lexical definitions:
identifier ::= (letter|"_") (letter | digit | "_")*
letter ::= lowercase | uppercase
lowercase ::= "a"..."z"
uppercase ::= "A"..."Z"
digit ::= "0"..."9"
You will need to use a string instead:
df["price_€"] ...
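As a quick check of the rule above, here is a minimal sketch using only the standard library. It assumes Python 3, where the identifier grammar is wider than the ASCII one quoted above but still excludes currency symbols such as the euro sign. The helper `is_valid_name` and the sample names are hypothetical, made up for illustration.

```python
import keyword

def is_valid_name(name):
    """Return True if `name` could be used as a Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical sample names; only the last two contain the euro sign.
names = ["price", "price_eur", "price_€", "€"]
validity = {name: is_valid_name(name) for name in names}

# Trying to compile an assignment to such a name fails at parse time.
try:
    compile("price_€ = 1", "<example>", "exec")
    compiled = True
except SyntaxError:
    compiled = False

# The same text is perfectly fine as a string key, which is why
# bracket access with a string works where a bare name does not.
record = {"price_€": 1}
value = record["price_€"]
```

The dictionary lookup at the end plays the same role as the `df["price_€"]` form: the euro sign lives inside a string, so the identifier rules never apply to it.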
pandas has no problem with the euro sign for me:
import pandas as pd
df = pd.DataFrame([[1, 2]], columns=["£", "€"])
print(df["€"])
print(df["£"])
0 2
Name: €, dtype: int64
0 1
Name: £, dtype: int64
The file is CP1252-encoded, so you need to specify the encoding:
import pandas as pd
import codecs
df = pd.read_csv("PPR-2015.csv",header=0,encoding="cp1252")
print(df.columns)
Index([u'Date of Sale (dd/mm/yyyy)', u'Address', u'Postal Code', u'County',
u'Price (€)', u'Not Full Market Price', u'VAT Exclusive', u'Description of Property', u'Property Size Description'], dtype='object')
print(df[u'Price (€)'])
0 €138,000.00
1 €270,000.00
2 €67,000.00
3 €900,000.00
4 €176,000.00
5 €155,000.00
6 €100,000.00
7 €120,000.00
8 €470,000.00
9 €140,000.00
10 €592,000.00
11 €85,000.00
12 €422,500.00
13 €225,000.00
14 €55,000.00
...
17433 €262,000.00
17434 €155,000.00
17435 €750,000.00
17436 €96,291.69
17437 €112,000.00
17438 €350,000.00
17439 €190,000.00
17440 €25,000.00
17441 €100,000.00
17442 €75,000.00
17443 €46,000.00
17444 €175,000.00
17445 €48,500.00
17446 €150,000.00
17447 €400,000.00
Name: Price (€), Length: 17448, dtype: object
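To see why the encoding argument matters, here is a small standard-library sketch (Python 3 assumed). The two-line `sample_text` is a hypothetical stand-in for the real file, reusing one price from the output above; it is not the actual CSV.

```python
import csv
import io

# Hypothetical two-line sample standing in for the real file, encoded as CP1252.
sample_text = 'Price (€)\n"€138,000.00"\n'
raw_bytes = sample_text.encode("cp1252")

# In CP1252 the euro sign is the single byte 0x80.
euro_byte = "€".encode("cp1252")

# That byte is not valid UTF-8 on its own, so a UTF-8 decode fails.
try:
    raw_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the right codec recovers the header and the value intact.
stream = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="cp1252", newline="")
rows = list(csv.DictReader(stream))
```

Passing `encoding="cp1252"` to `pd.read_csv` does the same job as the `TextIOWrapper` here: it tells the reader how to turn byte 0x80 back into the euro sign.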
Then convert to float:
df[u'Price (€)'] = df[u'Price (€)'].str.replace(ur'[€,]', '').astype('float')
print(df['Price (€)'.decode("utf-8")])
Output:
0 138000
1 270000
2 67000
3 900000
4 176000
5 155000
6 100000
7 120000
8 470000
9 140000
10 592000
11 85000
12 422500
13 225000
14 55000
...
17433 262000.00
17434 155000.00
17435 750000.00
17436 96291.69
17437 112000.00
17438 350000.00
17439 190000.00
17440 25000.00
17441 100000.00
17442 75000.00
17443 46000.00
17444 175000.00
17445 48500.00
17446 150000.00
17447 400000.00
Name: Price (€), Length: 17448, dtype: float64
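The cleaning step can also be tried on plain strings. This is a rough sketch in Python 3 with the standard library only; `parse_price` is a hypothetical helper, and the two inputs are taken from the output above. It applies the same `[€,]` character class as the pandas call.

```python
import re

def parse_price(text):
    """Strip the euro sign and thousands separators, then convert to float."""
    cleaned = re.sub(r"[€,]", "", text)
    return float(cleaned)

# Two values copied from the printed column above.
prices = ["€138,000.00", "€96,291.69"]
parsed = [parse_price(p) for p in prices]
```

Note that the `ur'...'` prefix only exists in Python 2. In Python 3 a plain `r'[€,]'` pattern is enough, and recent pandas versions need `regex=True` passed to `str.replace` for the pattern to be treated as a regular expression.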
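Do you mean that when you print the data frame you see 'data_ '? If so, then your problem is the encoding.
Hi Padraic, yes, when I print the frame I see 'price_ '. Is there a way to fix this, or do I need to change the input file manually? – Marcus
How are you creating the data frame?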