Splitting one column into multiple columns with Python pandas

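I want to split one column of my DataFrame into multiple columns. The values are comma-separated.

I want to do the equivalent of Excel's 'Text to Columns' function.

After splitting the column I will give the new columns my own headers. 'Turnstile' is the name of my column. I have: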

(A006, R079, 00-00-04, 5 AVE-59 ST)

as the kind of data in each row. In the end, I want to have:

A006 R079 00-00-04 5 AVE-59 ST

with the headers that I will create.

The last thing I tried was:

df.Turnstile.str.split().tolist()

But all I got was 'nan'.

When I checked the type of the 'Turnstile' column, it said 'object'. I tried to convert the Series to strings:

df['Turnstile'] = df[['Turnstile'].astype(str)]

but it gave me:

AttributeError: 'list' object has no attribute 'astype'

Please advise.

Thank you.

Source

2015-09-27 lorelai

What do you get when you run type(df.Turnstile.values[0])? – maxymoo

It says it is a tuple. @maxymoo – lorelai

Can you check the dtype of each tuple entry? i.e. [type(df.Turnstile.values[0][i]) for i in range(4)] – maxymoo

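There are a few options here. If your data is genuinely in CSV format, for example exported from Excel, you can use pandas.read_csv to read in the file, and it will automatically split the data into columns based on the column separator.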
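As a minimal sketch of the read_csv route, assuming the raw data is comma-separated text with no header row. The inline sample string and the column names (Control_Area, Unit, SCP, Station) are placeholders for illustration, not names from the question:

```python
import io
import pandas as pd

# Hypothetical sample standing in for a CSV file exported from Excel.
raw = "A006,R079,00-00-04,5 AVE-59 ST\nA006,R079,00-00-04,5 AVE-59 ST\n"

# header=None: the data has no header row; names= supplies our own headers.
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=["Control_Area", "Unit", "SCP", "Station"],
)
print(df)
```

With a real file, the path would be passed in place of the io.StringIO object.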

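If your data is a column of comma-separated strings, you can use str.split to break it into columns, but as far as I know you need to dump the resulting column as a raw Python list and then convert it back into a DataFrame: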

import pandas as pd 
df = pd.DataFrame([["A006, R079, 00-00-04, 5 AVE-59 ST"]]) 
df2 = pd.DataFrame(df[0].str.split(',').tolist())
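For what it's worth, reasonably recent pandas versions also accept expand=True in str.split, which returns a DataFrame directly and avoids the round trip through a Python list. The sketch below assumes the column really holds strings (not tuples) separated by a comma and a space; the headers c1 to c4 are placeholders:

```python
import pandas as pd

# Sample data: one comma-separated string per row.
df = pd.DataFrame({"Turnstile": ["A006, R079, 00-00-04, 5 AVE-59 ST"] * 3})

# expand=True returns one column per piece instead of a Series of lists.
# Splitting on ", " (comma plus space) avoids leading spaces in the pieces.
parts = df["Turnstile"].str.split(", ", expand=True)

# Placeholder headers; replace with your own.
parts.columns = ["c1", "c2", "c3", "c4"]
print(parts)
```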

Source

2015-09-27 22:43:21 maxymoo

It gave me a KeyError. It doesn't work. @maxymoo As I mentioned, I ran df.Turnstile.str.split().tolist() and it gave me all 'nan' – lorelai

Perhaps another way to look at this is to convert a column of tuples into a DataFrame, like so:

In [10]: DataFrame(df['Turnstile'].tolist())
Out[10]:
      0     1         2            3
0  A006  R079  00-00-04  5 AVE-59 ST
1  A006  R079  00-00-04  5 AVE-59 ST
2  A006  R079  00-00-04  5 AVE-59 ST
3  A006  R079  00-00-04  5 AVE-59 ST
4  A006  R079  00-00-04  5 AVE-59 ST
5  A006  R079  00-00-04  5 AVE-59 ST
6  A006  R079  00-00-04  5 AVE-59 ST
7  A006  R079  00-00-04  5 AVE-59 ST
8  A006  R079  00-00-04  5 AVE-59 ST
9  A006  R079  00-00-04  5 AVE-59 ST

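If that is the case, here is an example that converts the column of tuples into a DataFrame and adds it back to the original DataFrame: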

import numpy as np
import pandas as pd
from pandas import Series, DataFrame

# create a fake dataframe, repeating the tuple given in the example
In [2]: df = DataFrame(data={'Observations': np.random.randn(10) * np.arange(10),
   ...:                      'Turnstile': (('A006', 'R079', '00-00-04', '5 AVE-59 ST'),)*10})

In [3]: df.head()
Out[3]:
   Observations                            Turnstile
0     -0.000000  (A006, R079, 00-00-04, 5 AVE-59 ST)
1     -0.022668  (A006, R079, 00-00-04, 5 AVE-59 ST)
2     -2.380515  (A006, R079, 00-00-04, 5 AVE-59 ST)
3     -4.209983  (A006, R079, 00-00-04, 5 AVE-59 ST)
4      3.932902  (A006, R079, 00-00-04, 5 AVE-59 ST)

# all at once turn the column of tuples into a dataframe and concat that with the original df
In [4]: df = pd.concat([df, DataFrame(df['Turnstile'].tolist())], axis=1, join='outer')

In [5]: df.head()
Out[5]:
   Observations                            Turnstile     0     1         2  \
0     -0.000000  (A006, R079, 00-00-04, 5 AVE-59 ST)  A006  R079  00-00-04
1     -0.022668  (A006, R079, 00-00-04, 5 AVE-59 ST)  A006  R079  00-00-04
2     -2.380515  (A006, R079, 00-00-04, 5 AVE-59 ST)  A006  R079  00-00-04
3     -4.209983  (A006, R079, 00-00-04, 5 AVE-59 ST)  A006  R079  00-00-04
4      3.932902  (A006, R079, 00-00-04, 5 AVE-59 ST)  A006  R079  00-00-04

             3
0  5 AVE-59 ST
1  5 AVE-59 ST
2  5 AVE-59 ST
3  5 AVE-59 ST
4  5 AVE-59 ST

# i assume you don't need this column anymore
In [6]: del df['Turnstile']

If this works, you can of course name the new columns however you like.
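One possible way to name the new columns up front is to pass columns= to the DataFrame constructor. This is only a sketch: the headers in new_cols are hypothetical, and the small sample frame stands in for the real data.

```python
import pandas as pd

# Small stand-in for the real data: a column of 4-element tuples.
df = pd.DataFrame({
    "Observations": range(3),
    "Turnstile": [("A006", "R079", "00-00-04", "5 AVE-59 ST")] * 3,
})

# Hypothetical headers; replace with your own.
new_cols = ["Control_Area", "Unit", "SCP", "Station"]

# Build the split columns with the chosen names, reusing the original index
# so the rows line up when concatenated.
split = pd.DataFrame(df["Turnstile"].tolist(), columns=new_cols, index=df.index)

# Drop the tuple column and attach the named columns in one step.
df = pd.concat([df.drop(columns="Turnstile"), split], axis=1)
print(df)
```

Passing index=df.index matters when the original frame does not have a default 0..n-1 index; otherwise concat would align on mismatched labels.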

Source

2015-09-28 16:25:59 measureallthethings

Thanks @measurealltheings – lorelai

@measureallthethings this is a better answer than mine; I didn't realise you could create a DataFrame from a list of tuples – maxymoo

Try doing df.Turnstile.str.split('')

Source

2017-06-02 08:03:38 lightyagami96

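When answering a question, please provide an explanation that relates to your code. Some people may not understand your code, or may not see how it answers the question. See [How to write a good answer](https://stackoverflow.com/help/how-to-answer) – Nuageux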
