2017-03-09

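How to change a column of a DataFrame from float to integer (Pandas)

I am trying to edit an entire column of a DataFrame so that each cell, which currently holds an integer followed by a string, keeps only the integer part.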

Actual column:

0       11212; xxxxxxxxxx xxxxxxxx 
1       11212; xxxxxxxxxx xxxxxxxx 
2       11212; xxxxxxxxxx xxxxxxxx 
3       11212; xxxxxxxxxx xxxxxxxx  
8     667788; xxxxxxx xxxxxxxxxxxxx xxxxxx 
9     55555; xxxxxxx xxxxxxxxxxxxx xxxxxx 
10     55555; xxxxxxx xxxxxxxxxxxxx xxxxxx 
11     55555; xxxxxxx xxxxxxxxxxxxx xxxxxx 
12     33333; xxxxxxx xxxxxxxxxxxxx xxxxxx 
13     333; xxx xxxxx @ xxx xxx 2 xxxx 
14     9991; xxxx; xxxxxx xxxxx xxxx @ 2 xxx 
18      1635; vvvvvvvvvvvv vvvvvv 10 
19      1635; vvvvvvvvvvvv vvvvvv 10 
20      1635; vvvvvvvvvvvv vvvvvv 10 
21      1635; vvvvvvvvvvvv vvvvvv 10  
32      1712; Cxxxx xxxxxxxx; xxx 0 
33      1712; Cxxxx xxxxxxxx; xxx 0 
34      1712; Cxxxx xxxxxxxx; xxx 0 
35      1712; Cxxxx xxxxxxxx; xxx 0 

This is what I have been trying:

import pandas as pd
import re

# import excel file from Trello
xlsx = pd.ExcelFile("/home/deon/Documents/Work_Stuff/Trello.xls")
# create data frame from excel file on sheet 1
df2 = pd.read_excel(xlsx,'Sheet1')
df3 = pd.DataFrame(data=df2)

# delete columns not relative to us
df3.drop(df3.columns[[0,5,10,11]],inplace=True,axis=1)
df3.columns= "Date*", "Due date", "Week*", "Card", "Board", "List", "S", "E 1st"

df3[:, 6] = df3.iloc[:,6].apply(lambda x: x.split(';')[0])
print df2.head()

# Also tried
digits = df3.iloc[:, 4].apply(lambda x: re.findall('\d+', str(x)))
df3.iloc[:, 4] = digits.str.get(0).astype(int)
print df3.head()
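It seems to me that the first code block would give you what you want, but as a string. You would just need to convert it to int. What output did you get from it? – ASignor

AttributeError: 'numpy.float64' object has no attribute 'split'

For the second example I get: ValueError: Cannot convert NA to integer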

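Answer

Your code has the right general idea of splitting the string; the trouble came when referencing the DataFrame. Try something more along the lines of: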

Code:

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0])) 
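The same idea can be applied to a column chosen by position, as the question does, by looking up the column label first and assigning through that label. The following is only a minimal sketch: the small frame is a hypothetical stand-in for the Trello export, and its column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the Trello data; only the "S" column matters here.
df3 = pd.DataFrame({
    "Card": ["a", "b"],
    "S": ["11212; xxxxxxxxxx xxxxxxxx", "1635; vvvvvvvvvvvv vvvvvv 10"],
})

# Look up the label of the column at position 1, then assign through the label.
col = df3.columns[1]
df3[col] = df3[col].apply(lambda x: int(x.split(";")[0]))

print(df3)
```

Assigning through the label replaces the whole column, so the result takes an integer dtype rather than staying a column of Python objects.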

Test code:

data = [x.strip() for x in """ 
      11212; xxxxxxxxxx xxxxxxxx 
      11212; xxxxxxxxxx xxxxxxxx 
      11212; xxxxxxxxxx xxxxxxxx 
      11212; xxxxxxxxxx xxxxxxxx 
    667788; xxxxxxx xxxxxxxxxxxxx xxxxxx 
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx 
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx 
""".split('\n')[1:-1]] 

import pandas as pd 
df = pd.DataFrame(data=data, columns=['raw_string']) 

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0])) 

print(df.head()) 

Results:

                             raw_string  number
0            11212; xxxxxxxxxx xxxxxxxx   11212
1            11212; xxxxxxxxxx xxxxxxxx   11212
2            11212; xxxxxxxxxx xxxxxxxx   11212
3            11212; xxxxxxxxxx xxxxxxxx   11212
4  667788; xxxxxxx xxxxxxxxxxxxx xxxxxx  667788
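The test data above contains only strings. The errors quoted in the comments (a numpy.float64 with no split attribute, and an NA that cannot be converted to integer) suggest that the real column also holds missing values. One possible way to cope with that, sketched here under the assumption of a pandas version recent enough to support the nullable "Int64" dtype, is to extract the leading digits with a regular expression and keep missing cells as <NA>. The sample frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two well-formed cells and one missing value (NaN).
df = pd.DataFrame({
    "raw_string": [
        "11212; xxxxxxxxxx xxxxxxxx",
        np.nan,
        "1712; Cxxxx xxxxxxxx; xxx 0",
    ]
})

# Capture the leading run of digits; cells that are missing stay missing.
leading = df["raw_string"].str.extract(r"^\s*(\d+)", expand=False)

# Convert to numbers, then to the nullable integer dtype so that
# missing cells become <NA> instead of forcing a float column.
df["number"] = pd.to_numeric(leading).astype("Int64")

print(df)
```

If rows without a number are not needed, calling dropna on the column before converting would allow a plain int dtype instead.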