2014-12-24 189 views
1

Adding a column to a pandas DataFrame fills it with NA

I have a pandas DataFrame like this:

          SourceDomain                           1  2         3
0  www.theguardian.com     profile.theguardian.com  1  Directed
1  www.theguardian.com  membership.theguardian.com  2  Directed
2  www.theguardian.com   subscribe.theguardian.com  3  Directed
3  www.theguardian.com            www.google.co.uk  4  Directed
4  www.theguardian.com        jobs.theguardian.com  5  Directed

I want to add a new column from a pandas Series that was created like this:

Weights = Weights.value_counts() 

However, when I try to add the new column with edgesFile[4] = Weights, it is filled with NA instead of the actual values:

          SourceDomain                           1  2         3    4
0  www.theguardian.com     profile.theguardian.com  1  Directed  NaN
1  www.theguardian.com  membership.theguardian.com  2  Directed  NaN
2  www.theguardian.com   subscribe.theguardian.com  3  Directed  NaN
3  www.theguardian.com            www.google.co.uk  4  Directed  NaN
4  www.theguardian.com        jobs.theguardian.com  5  Directed  NaN

How can I add the new column so that it keeps its values? Thanks.

Dani

Answers

1

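You get NaN because the index of Weights does not match the index of edgesFile. If you want pandas to ignore Weights.index and simply paste the values in order, pass the underlying NumPy array instead: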

edgesFile[4] = Weights.values 

Here is an example that demonstrates the difference:

In [14]: df = pd.DataFrame(np.arange(4)*10, index=list('ABCD'))

In [15]: df
Out[15]:
    0
A   0
B  10
C  20
D  30

In [16]: s = pd.Series(np.arange(4), index=list('CDEF'))

In [17]: s
Out[17]:
C    0
D    1
E    2
F    3
dtype: int64

Here we see pandas aligning on the index, so only the shared labels C and D receive values:

In [18]: df[4] = s

In [19]: df
Out[19]:
    0    4
A   0  NaN
B  10  NaN
C  20    0
D  30    1

Here, pandas simply pastes the values of s into the column, in order:

In [20]: df[4] = s.values

In [21]: df
Out[21]:
    0  4
A   0  0
B  10  1
C  20  2
D  30  3
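
Note that assigning `.values` only works when the Series has exactly one entry per row of the DataFrame. A `value_counts()` result has one entry per distinct value, labelled by that value, so its length may well differ. If the intent is to give each row the count of its own value, something like the following sketch may be closer to what is wanted. The column name `target` and the sample data are assumptions made up for illustration; they are not taken from the question.

```python
import pandas as pd

# Hypothetical data; 'target' stands in for whichever column was counted.
edges = pd.DataFrame({"target": ["a.com", "b.com", "a.com", "c.com", "a.com"]})

# value_counts() is indexed by the distinct values, not by row labels.
counts = edges["target"].value_counts()

# map() looks up each row's value in counts, so the result lines up with the rows.
edges["weight"] = edges["target"].map(counts)
print(edges)
```

Because `map` does the lookup by value, this does not depend on the row index of `edges` at all.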

Thanks! It works, and I just learned something very interesting. I will accept the answer as soon as I can.

0

Here is a small example for your problem.

You can add a new column, with a column name, to an existing DataFrame:

>>> import pandas as pd
>>> from pandas import DataFrame, Series
>>> df = DataFrame([[1,2,3],[4,5,6]], columns = ['A', 'B', 'C'])
>>> df
   A  B  C
0  1  2  3
1  4  5  6

>>> s = Series([7,8])
>>> s
0    7
1    8
dtype: int64

>>> df['D'] = s
>>> df
   A  B  C  D
0  1  2  3  7
1  4  5  6  8
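
The assignment above works because both `df` and `s` carry the default 0, 1, ... index, so the labels already line up. As a rough sketch, one way to check this before assigning is to compare the two indexes directly; the variable name `same_labels` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["A", "B", "C"])
s = pd.Series([7, 8])

# Both objects use the default integer index, so alignment keeps every value.
same_labels = df.index.equals(s.index)
print(same_labels)

df["D"] = s
print(df)
```

If the check fails, the assignment would leave NaN wherever a row label of `df` is missing from `s`.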

Alternatively, you can build a DataFrame from the Series and then concat the two:

>>> df = DataFrame([[1,2,3],[4,5,6]])
>>> df
   0  1  2
0  1  2  3
1  4  5  6

>>> s = DataFrame(Series([7,8]), columns=['4'])  # if you don't provide a column name, the default name will be 0
>>> s
   4
0  7
1  8

>>> df = pd.concat([df, s], axis=1)
>>> df
   0  1  2  4
0  1  2  3  7
1  4  5  6  8
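
Keep in mind that `pd.concat(..., axis=1)` also aligns on the index, so it can produce NaN in the same way as plain column assignment when the labels differ. The sketch below uses made-up labels (10 and 11) to show the effect, and resets the index as one possible way around it; whether discarding the original labels is acceptable depends on the data.

```python
import pandas as pd

# Hypothetical frame whose row labels (10, 11) differ from the Series labels (0, 1).
df = pd.DataFrame({"a": [1, 2]}, index=[10, 11])
s = pd.Series([7, 8], name="b")

# Labels do not overlap, so concat keeps the union of labels and fills the gaps with NaN.
misaligned = pd.concat([df, s], axis=1)
print(misaligned)

# Dropping the old labels gives both objects the default index, so rows pair up by position.
aligned = pd.concat([df.reset_index(drop=True), s], axis=1)
print(aligned)
```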

Hope this helps.