2013-03-12 58 views
1

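pandas join/merge: 'Reindexing only valid with uniquely valued Index objects'

Hi, I am getting this error on a very straightforward join operation. I also tried merge (left_index, right_index), with the same result.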

The indexes are identical (by design). I checked both with index.is_unique (True) and index.get_duplicates() (empty).
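For readers on a current pandas release, where `Index.get_duplicates()` is no longer available, the same index check can be sketched as below. The timestamps are made up for illustration and are not the question's data.

```python
import pandas as pd

# Hypothetical index with one repeated timestamp, for illustration only.
idx = pd.DatetimeIndex(
    ["2013-01-14 17:04:45", "2013-01-14 17:04:35", "2013-01-14 17:04:35"]
)

# False here, because one timestamp occurs twice.
print(idx.is_unique)

# Labels that occur more than once, each reported a single time.
dups = idx[idx.duplicated()].unique()
print(list(dups))
```

An empty `dups` together with `is_unique == True` corresponds to the result described in the question.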

Basic version:

df1.join(series) 
merge(df1, series_as_df, 

print tempres.index 

[2013-01-14 17:04:45, ..., 2013-01-14 16:53:05] Length: 89, Freq: None, Timezone: None


The odd part is the printed values:

print tempres.index.values
[1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]

If needed, I can add the pickled series and df...

Using the latest version of pandas, 0.10.x.

Thanks,

Luc

My code (cut from a larger program):

XYTparams (existing dataframe) 
prep_functions[funcname] = [list of values, same length as XYTparams] 

iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname) 
XYTparams = XYTparams.join(iSeries) 

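Revisiting my problem:

I apply merge and join repeatedly to a base DataFrame. At some point the next merge/join starts raising the error. I cannot reproduce this in a simple test, but I saved the dataframes just before the problem started.

I cannot find what is wrong.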

base_df = load('SPOparams.pic') 
lookup_df = load('lookup.pic') 

print base_df 
print lookup_df 

print base_df.count() 

print base_df['VKCSKEY1'] 
print lookup_df['traf_key'] 

# reset index does not change a thing 
base_df = base_df.reset_index(drop=True) 

print base_df.index 
print base_df.index.get_duplicates() 
print lookup_df.index 
print lookup_df.index.get_duplicates() 


# checking value matches 
for k in lookup_df['traf_key']: 
    print k, k in base_df['VKCSKEY1'].values 

# why is this merge unsuccessful ??? 
# in any combination of the parameters 
df_result =merge(base_df, lookup_df, 
      how='left', 
      #how = 'outer', 
      left_on ='VKCSKEY1', 
      right_on ='traf_key', 
      #left_index=True, 
      #right_index = True, 
      #sort=True, 
      #suffixes=('', '.m'), copy=True 
      ) 
print df_result 

Output:

1.6.1 
0.10.1 
<class 'pandas.core.frame.DataFrame'> 
Int64Index: 89 entries, 0 to 88 
Data columns: 
T      89 non-null values 
X      89 non-null values 
Y      89 non-null values 
precip_quantity_1hour 89 non-null values 
pressure     89 non-null values 
rel_humidity    89 non-null values 
temp      89 non-null values 
temp_max     0 non-null values 
temp_min     0 non-null values 
wind_direction   89 non-null values 
wind_speed    89 non-null values 
BC_TRAF     89 non-null values 
closest     89 non-null values 
closest.m    89 non-null values 
AGGP.P50_ID    89 non-null values 
AGGP.FUNC_CLASS   89 non-null values 
AGGP.SPEED_CAT   89 non-null values 
LINK_ID     89 non-null values 
FUNC_CLASS    89 non-null values 
SPEED_CAT    89 non-null values 
AR_AUTO     89 non-null values 
AR_BUS     89 non-null values 
AR_TAXIS     89 non-null values 
AR_CARPOOL    89 non-null values 
AR_PEDEST    89 non-null values 
AR_TRUCKS    89 non-null values 
STCA20_PCT    89 non-null values 
VKC_LINKNR    89 non-null values 
TRVIC150R1    89 non-null values 
closest.m    89 non-null values 
closest.m.m    89 non-null values 
VKCP.LINK_ID    89 non-null values 
VKCP.FUNC_CLASS   89 non-null values 
VKCP.SPEED    89 non-null values 
VKCP.LINKNR    89 non-null values 
VKCP.TWIN_ID    89 non-null values 
VKCSKEY1     89 non-null values 
dtypes: datetime64[ns](1), float64(13), int64(9), object(14) 
<class 'pandas.core.frame.DataFrame'> 
Index: 30 entries, (60744, 0) to (58314, 0) 
Data columns: 
traf_key  30 non-null values 
weekday_nr 30 non-null values 
linknr  30 non-null values 
weekday  30 non-null values 
vr0   30 non-null values 
vr1   30 non-null values 
vr2   30 non-null values 
vr3   30 non-null values 
vr4   30 non-null values 
vr5   30 non-null values 
vr6   30 non-null values 
vr7   30 non-null values 
vr8   30 non-null values 
vr9   30 non-null values 
vr10   30 non-null values 
vr11   30 non-null values 
vr12   30 non-null values 
vr13   30 non-null values 
vr14   30 non-null values 
vr15   30 non-null values 
vr16   30 non-null values 
vr17   30 non-null values 
vr18   30 non-null values 
vr19   30 non-null values 
vr20   30 non-null values 
vr21   30 non-null values 
vr22   30 non-null values 
vr23   30 non-null values 
au0   30 non-null values 
au1   30 non-null values 
au2   30 non-null values 
au3   30 non-null values 
au4   30 non-null values 
au5   30 non-null values 
au6   30 non-null values 
au7   30 non-null values 
au8   30 non-null values 
au9   30 non-null values 
au10   30 non-null values 
au11   30 non-null values 
au12   30 non-null values 
au13   30 non-null values 
au14   30 non-null values 
au15   30 non-null values 
au16   30 non-null values 
au17   30 non-null values 
au18   30 non-null values 
au19   30 non-null values 
au20   30 non-null values 
au21   30 non-null values 
au22   30 non-null values 
au23   30 non-null values 
sn0   30 non-null values 
sn1   30 non-null values 
sn2   30 non-null values 
sn3   30 non-null values 
sn4   30 non-null values 
sn5   30 non-null values 
sn6   30 non-null values 
sn7   30 non-null values 
sn8   30 non-null values 
sn9   30 non-null values 
sn10   30 non-null values 
sn11   30 non-null values 
sn12   30 non-null values 
sn13   30 non-null values 
sn14   30 non-null values 
sn15   30 non-null values 
sn16   30 non-null values 
sn17   30 non-null values 
sn18   30 non-null values 
sn19   30 non-null values 
sn20   30 non-null values 
sn21   30 non-null values 
sn22   30 non-null values 
sn23   30 non-null values 
dtypes: float64(24), int64(50), object(2) 
T      89 
X      89 
Y      89 
precip_quantity_1hour 89 
pressure     89 
rel_humidity    89 
temp      89 
temp_max     0 
temp_min     0 
wind_direction   89 
wind_speed    89 
BC_TRAF     89 
closest     89 
closest.m    89 
AGGP.P50_ID    89 
AGGP.FUNC_CLASS   89 
AGGP.SPEED_CAT   89 
LINK_ID     89 
FUNC_CLASS    89 
SPEED_CAT    89 
AR_AUTO     89 
AR_BUS     89 
AR_TAXIS     89 
AR_CARPOOL    89 
AR_PEDEST    89 
AR_TRUCKS    89 
STCA20_PCT    89 
VKC_LINKNR    89 
TRVIC150R1    89 
closest.m    89 
closest.m.m    89 
VKCP.LINK_ID    89 
VKCP.FUNC_CLASS   89 
VKCP.SPEED    89 
VKCP.LINKNR    89 
VKCP.TWIN_ID    89 
VKCSKEY1     89 
0  (60744, 0) 
1  (60744, 0) 
2  (60744, 0) 
3  (60750, 0) 
4  (60768, 0) 
5  (60768, 0) 
6  (60758, 0) 
7  (60758, 0) 
8  (69223, 0) 
9  (69223, 0) 
10 (69223, 0) 
11 (64265, 0) 
12 (64265, 0) 
13 (64265, 0) 
14 (64265, 0) 
15 (64265, 0) 
16 (64265, 0) 
17 (64265, 0) 
18 (64265, 0) 
19 (64265, 0) 
20 (64216, 0) 
21 (64216, 0) 
22 (64216, 0) 
23 (64216, 0) 
24 (64216, 0) 
25 (64216, 0) 
26 (64216, 0) 
27 (64216, 0) 
28 (64216, 0) 
29 (57085, 0) 
30 (57085, 0) 
31 (57085, 0) 
32 (57085, 0) 
33 (57085, 0) 
34 (57085, 0) 
35 (57014, 0) 
36 (57033, 0) 
37 (57033, 0) 
38 (64065, 0) 
39 (64065, 0) 
40 (64065, 0) 
41 (64065, 0) 
42 (64065, 0) 
43 (57070, 0) 
44 (64062, 0) 
45 (64062, 0) 
46 (64062, 0) 
47 (64062, 0) 
48 (57070, 0) 
49 (64061, 0) 
50 (64061, 0) 
51 (64061, 0) 
52 (64061, 0) 
53 (59849, 0) 
54 (59415, 0) 
55 (58487, 0) 
56 (58054, 0) 
57 (58054, 0) 
58 (58054, 0) 
59 (52551, 0) 
60 (58054, 0) 
61 (58054, 0) 
62 (58054, 0) 
63 (58054, 0) 
64 (52551, 0) 
65 (58054, 0) 
66 (58488, 0) 
67 (58488, 0) 
68 (58028, 0) 
69 (58464, 0) 
70 (58028, 0) 
71 (57989, 0) 
72 (58595, 0) 
73 (58027, 0) 
74 (57989, 0) 
75 (58595, 0) 
76 (58595, 0) 
77 (58019, 0) 
78 (58595, 0) 
79 (58595, 0) 
80 (58019, 0) 
81 (58595, 0) 
82 (58595, 0) 
83 (66715, 0) 
84 (58595, 0) 
85 (59295, 0) 
86 (67614, 0) 
87 (58314, 0) 
88 (58314, 0) 
Name: VKCSKEY1, Length: 89 
VKCSKEY1 
(60744, 0) (60744, 0) 
(60750, 0) (60750, 0) 
(60768, 0) (60768, 0) 
(60758, 0) (60758, 0) 
(69223, 0) (69223, 0) 
(64265, 0) (64265, 0) 
(64216, 0) (64216, 0) 
(57085, 0) (57085, 0) 
(57014, 0) (57014, 0) 
(57033, 0) (57033, 0) 
(64065, 0) (64065, 0) 
(57070, 0) (57070, 0) 
(64062, 0) (64062, 0) 
(64061, 0) (64061, 0) 
(59849, 0) (59849, 0) 
(59415, 0) (59415, 0) 
(58487, 0) (58487, 0) 
(58054, 0) (58054, 0) 
(52551, 0) (52551, 0) 
(58488, 0) (58488, 0) 
(58028, 0) (58028, 0) 
(58464, 0) (58464, 0) 
(57989, 0) (57989, 0) 
(58595, 0) (58595, 0) 
(58027, 0) (58027, 0) 
(58019, 0) (58019, 0) 
(66715, 0) (66715, 0) 
(59295, 0) (59295, 0) 
(67614, 0) (67614, 0) 
(58314, 0) (58314, 0) 
Name: traf_key 
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64) 
[] 
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object) 
[] 
(60744, 0) True 
(60750, 0) True 
(60768, 0) True 
(60758, 0) True 
(69223, 0) True 
(64265, 0) True 
(64216, 0) True 
(57085, 0) True 
(57014, 0) True 
(57033, 0) True 
(64065, 0) True 
(57070, 0) True 
(64062, 0) True 
(64061, 0) True 
(59849, 0) True 
(59415, 0) True 
(58487, 0) True 
(58054, 0) True 
(52551, 0) True 
(58488, 0) True 
(58028, 0) True 
(58464, 0) True 
(57989, 0) True 
(58595, 0) True 
(58027, 0) True 
(58019, 0) True 
(66715, 0) True 
(59295, 0) True 
(67614, 0) True 
(58314, 0) True 
Traceback (most recent call last): 
    File "L:\temp\pandas_join_bug.py", line 43, in <module> 
    right_on ='traf_key', 
    File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge 
    return op.get_result() 
    File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result 
    ldata, rdata = self._get_merge_data() 
    File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data 
    copydata=False) 
    File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join 
    to_rename = self.items.intersection(other.items) 
    File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection 
    indexer = self.get_indexer(other.values) 
    File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer 
    raise Exception('Reindexing only valid with uniquely valued Index ' 
Exception: Reindexing only valid with uniquely valued Index objects 

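Once the error appears, I cannot get any merge or join statement to succeed. At first I did not see that the error was tied to repeated merge/join operations. For now, every merge/join in my latest setup works, but as soon as I need one more merge/join I get the same error. I have been struggling with this for days...

Help!!!

Luc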
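An editorial observation on the output above: the column listing shows `closest.m` twice, followed by `closest.m.m`, and the commented-out arguments include `suffixes=('', '.m')`. One plausible reading, which the thread does not confirm, is that reusing the same suffix across successive merges produced the clashing names. A minimal sketch of giving each merge its own suffix follows; the frames and the `key` column are hypothetical.

```python
import pandas as pd

# Hypothetical frames that all share a non-key column called "closest".
base = pd.DataFrame({"key": [1, 2], "closest": [0.1, 0.2]})
extra_a = pd.DataFrame({"key": [1, 2], "closest": [1.1, 1.2]})
extra_b = pd.DataFrame({"key": [1, 2], "closest": [2.1, 2.2]})

# A distinct right-hand suffix per merge keeps every column name unique.
step1 = pd.merge(base, extra_a, on="key", how="left", suffixes=("", ".a"))
step2 = pd.merge(step1, extra_b, on="key", how="left", suffixes=("", ".b"))

print(step2.columns.tolist())
# ['key', 'closest', 'closest.a', 'closest.b']
```

Checking `step2.columns.is_unique` after each step is a cheap way to catch a clash before the next merge.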

+0

Please post your data and code. – HYRY 2013-03-12 12:59:13

+0

Something strange is going on. If the series data is numeric, the code works; if it is a tuple or a string, it fails... – user1708646 2013-03-13 19:32:51

+0

FYI, your print is just showing how numpy 1.6.2 represents dates. What are you doing? – Jeff 2013-03-14 14:48:27

Answers

9

Duplicate column names cause this error; try eliminating the duplicate column names.
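A minimal sketch of that fix, assuming a current pandas release. The frames below are small hypothetical stand-ins that reuse the question's column names, with `closest.m` appearing twice as it does in the posted output; keeping the first occurrence of each name is just one possible policy.

```python
import pandas as pd

# Hypothetical stand-in for base_df: "closest.m" appears twice.
base_df = pd.DataFrame(
    [[1, 10.0, 11.0, 12.0], [2, 20.0, 21.0, 22.0]],
    columns=["VKCSKEY1", "closest", "closest.m", "closest.m"],
)
lookup_df = pd.DataFrame({"traf_key": [1, 2], "vr0": [100, 200]})

# List the column names that occur more than once.
dup_names = base_df.columns[base_df.columns.duplicated()].unique().tolist()
print(dup_names)

# Keep only the first occurrence of each column name.
deduped = base_df.loc[:, ~base_df.columns.duplicated()]

# With unique column names on both sides, the merge can proceed.
result = deduped.merge(
    lookup_df, how="left", left_on="VKCSKEY1", right_on="traf_key"
)
print(result.columns.tolist())
```

If the duplicated columns hold different data, renaming them is safer than dropping them.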

+0

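Shamelessly stolen from the comments above, so that the answer sits where newcomers will look for it. I had this exact problem and almost missed the answer. – 2013-08-29 20:12:17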
