Pandas concat: ValueError: Shape of passed values is blah, indices imply blah2

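I am trying to merge a DataFrame and a Series (Pandas 14.1). The Series should form a new column, with some NAs (because the index values of the Series are a subset of the index values of the DataFrame).

This works for a toy example, but not for my data (detailed below).

Example: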

import pandas as pd 
import numpy as np 

df1 = pd.DataFrame(np.random.randn(6, 4), columns=['A', 'B', 'C', 'D'], index=pd.date_range('1/1/2011', periods=6, freq='D')) 
df1 

A B C D 
2011-01-01 -0.487926 0.439190 0.194810 0.333896 
2011-01-02 1.708024 0.237587 -0.958100 1.418285 
2011-01-03 -1.228805 1.266068 -1.755050 -1.476395 
2011-01-04 -0.554705 1.342504 0.245934 0.955521 
2011-01-05 -0.351260 -0.798270 0.820535 -0.597322 
2011-01-06 0.132924 0.501027 -1.139487 1.107873 

s1 = pd.Series(np.random.randn(3), name='foo', index=pd.date_range('1/1/2011', periods=3, freq='2D')) 
s1 

2011-01-01 -1.660578 
2011-01-03 -0.209688 
2011-01-05 0.546146 
Freq: 2D, Name: foo, dtype: float64 

pd.concat([df1, s1],axis=1) 

A B C D foo 
2011-01-01 -0.487926 0.439190 0.194810 0.333896 -1.660578 
2011-01-02 1.708024 0.237587 -0.958100 1.418285 NaN 
2011-01-03 -1.228805 1.266068 -1.755050 -1.476395 -0.209688 
2011-01-04 -0.554705 1.342504 0.245934 0.955521 NaN 
2011-01-05 -0.351260 -0.798270 0.820535 -0.597322 0.546146 
2011-01-06 0.132924 0.501027 -1.139487 1.107873 NaN

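The situation with the data (see below) seems basically identical: concatting a Series with a DatetimeIndex whose values are a subset of the DataFrame's. But it gives the ValueError in the title (blah1 = (5, 286), blah2 = (5, 276)). Why doesn't it work?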

In[187]: df.head() 
Out[188]: 
high low loc_h loc_l 
time     
2014-01-01 17:00:00 1.376235 1.375945 1.376235 1.375945 
2014-01-01 17:01:00 1.376005 1.375775 NaN NaN 
2014-01-01 17:02:00 1.375795 1.375445 NaN 1.375445 
2014-01-01 17:03:00 1.375625 1.375515 NaN NaN 
2014-01-01 17:04:00 1.375585 1.375585 NaN NaN 
In [186]: df.index 
Out[186]: 
<class 'pandas.tseries.index.DatetimeIndex'> 
[2014-01-01 17:00:00, ..., 2014-01-01 21:30:00] 
Length: 271, Freq: None, Timezone: None 

In [189]: hl.head() 
Out[189]: 
2014-01-01 17:00:00 1.376090 
2014-01-01 17:02:00 1.375445 
2014-01-01 17:05:00 1.376195 
2014-01-01 17:10:00 1.375385 
2014-01-01 17:12:00 1.376115 
dtype: float64 

In [187]:hl.index 
Out[187]: 
<class 'pandas.tseries.index.DatetimeIndex'> 
[2014-01-01 17:00:00, ..., 2014-01-01 21:30:00] 
Length: 89, Freq: None, Timezone: None 

In: pd.concat([df, hl], axis=1) 
Out: [stack trace] ValueError: Shape of passed values is (5, 286), indices imply (5, 276)

Source

2014-12-31 birone

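Have you tried `append` instead of `concat`? If I understand the `ValueError` correctly, it says there are 286 rows of data, but the DataFrame's index expects 276 rows. Try checking `len(df.index)` and `len(hl.index)`.

df.append(hl) fails with TypeError: 'NoneType' object is not iterable. But then I tried join. Thanks! :) – birone

No problem. Be sure to mark your answer as correct so that future SO users can quickly find your solution when they run into a similar problem.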
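The check suggested in the first comment can be sketched roughly as follows. The `describe_index` helper and the small `df` and `hl` objects are hypothetical stand-ins, not the question's real data; the idea is only to look at the length and uniqueness of each index before calling `pd.concat`.

```python
import pandas as pd

def describe_index(obj, label):
    """Summarise the index properties that matter for pd.concat(axis=1)."""
    idx = obj.index
    return {
        "name": label,
        "length": len(idx),
        "is_unique": idx.is_unique,
        # Number of labels that repeat an earlier label.
        "n_duplicates": int(idx.duplicated().sum()),
    }

# Hypothetical stand-ins for the question's df and hl (17:01 appears twice in df).
df = pd.DataFrame(
    {"high": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(["2014-01-01 17:00", "2014-01-01 17:01",
                          "2014-01-01 17:01", "2014-01-01 17:02"]),
)
hl = pd.Series([1.5, 2.5], name="hl",
               index=pd.to_datetime(["2014-01-01 17:00", "2014-01-01 17:02"]))

df_info = describe_index(df, "df")
hl_info = describe_index(hl, "hl")
# True when every label of hl also appears in df.
is_subset = bool(hl.index.isin(df.index).all())

print(df_info)
print(hl_info)
print("hl index is a subset of df index:", is_subset)
```

If `is_unique` comes back `False` for either object, repeated index labels are a likely suspect for the shape mismatch.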

Aus_lacy's post gave me the idea of trying related methods, of which join does work:

In [196]: 

hl.name = 'hl' 
Out[196]: 
'hl' 
In [199]: 

df.join(hl).head(4) 
Out[199]: 
high low loc_h loc_l hl 
2014-01-01 17:00:00 1.376235 1.375945 1.376235 1.375945 1.376090 
2014-01-01 17:01:00 1.376005 1.375775 NaN NaN NaN 
2014-01-01 17:02:00 1.375795 1.375445 NaN 1.375445 1.375445 
2014-01-01 17:03:00 1.375625 1.375515 NaN NaN NaN

Some insight into why concat works on the example but not on this data would be nice, though!

Source

2014-12-31 13:20:03 birone

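I had a similar problem (join worked, but concat failed). Check for duplicate index values in df1 and s1 (e.g. df1.index.is_unique).

Removing the duplicate index values (e.g. df.drop_duplicates(inplace=True)), or using one of the methods here https://stackoverflow.com/a/34297689/7163376, should solve the problem.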
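One thing to be aware of: `drop_duplicates` compares row values, not index labels, so two rows that share a timestamp but hold different values both survive it. A minimal sketch of dropping repeated index labels instead, using `Index.duplicated` from a reasonably recent pandas, could look like this. The data is made up, and keeping the first row per label is an assumption about what is wanted.

```python
import pandas as pd

# Hypothetical data: df1 has the label 1 twice, s1 is indexed by a subset.
df1 = pd.DataFrame({"A": [10, 11, 12, 13]}, index=[0, 1, 1, 2])
s1 = pd.Series([0.5, 0.7], index=[0, 2], name="foo")

# drop_duplicates looks at the values in the rows, so nothing is removed here
# even though the index label 1 is repeated.
by_value = df1.drop_duplicates()

# Keep the first row for each index label (assumption: the first one is wanted).
df1_unique = df1[~df1.index.duplicated(keep="first")]

# With unique labels on both sides, the Series lines up as a new column.
result = pd.concat([df1_unique, s1], axis=1)
print(result)
```

Aggregating the repeated rows first (for example with `groupby(level=0)`) would be an alternative when no single row should win.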

Source

2015-01-12 20:34:09 lmart999

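It worked, thanks! I did this: df = pd.concat([df1, df2], axis=1, join_axes=[df1.index]). If I have dups in df2, then I get this error. That makes sense, since it doesn't know how to map multiple duplicate index values across the two DFs. – sparrow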
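For readers on newer versions: the `join_axes` argument used in this comment was deprecated in pandas 0.25 and removed in pandas 1.0. A rough equivalent, sketched here on made-up frames with unique indexes, is to concatenate and then reindex to the first frame's index.

```python
import pandas as pd

# Hypothetical frames; both indexes are unique.
df1 = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "b", "c"])
df2 = pd.DataFrame({"B": [10, 30, 40]}, index=["a", "c", "d"])

# Outer-join on the index, then keep only df1's labels, in df1's order.
df = pd.concat([df1, df2], axis=1).reindex(df1.index)
print(df)
```

`df1.join(df2)` gives the same left-aligned result in this case. Neither form removes the need for unique index labels in the pieces being combined.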

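My problem was that the indexes were different; the following code solved it.

# reset_index returns a new object, so the result has to be assigned back
df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)
df = pd.concat([df1, df2], axis=1)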
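Two caveats with this approach, illustrated on hypothetical frames below: `reset_index` returns a new object rather than modifying in place, so its result has to be used or assigned, and once the labels are dropped the rows are paired purely by position. That is only appropriate when both frames are already in the same row order.

```python
import pandas as pd

# Hypothetical frames whose index labels have nothing in common.
df1 = pd.DataFrame({"A": [1, 2]}, index=["x", "y"])
df2 = pd.DataFrame({"B": [3, 4]}, index=["p", "q"])

# With unrelated labels, concat takes the union of the labels and pads
# the gaps with NaN, giving four rows.
by_label = pd.concat([df1, df2], axis=1)

# After dropping the labels, rows are paired by position, giving two rows.
by_position = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)], axis=1
)

print(by_label.shape, by_position.shape)
```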

Source

2017-12-27 11:56:19 flow

Your index probably contains duplicate values.

import pandas as pd 

T1_INDEX = [ 
    0, 
    1, # <= !!! if I write e.g.: "0" here then it fails 
    0.2, 
] 
T1_COLUMNS = [ 
    'A', 'B', 'C', 'D' 
] 
T1 = [ 
    [1.0, 1.1, 1.2, 1.3], 
    [2.0, 2.1, 2.2, 2.3], 
    [3.0, 3.1, 3.2, 3.3], 
] 

T2_INDEX = [ 
    1.2, 
    2.11, 
] 

T2_COLUMNS = [ 
    'D', 'E', 'F', 
] 
T2 = [ 
    [54.0, 5324.1, 3234.2], 
    [55.0, 14.5324, 2324.2], 
    # [3.0, 3.1, 3.2], 
] 
df1 = pd.DataFrame(T1, columns=T1_COLUMNS, index=T1_INDEX) 
df2 = pd.DataFrame(T2, columns=T2_COLUMNS, index=T2_INDEX) 


print(pd.concat([pd.DataFrame({})] + [df2, df1], axis=1))

Source

2018-02-09 11:17:14
