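I am getting a very strange error on a join operation. I also tried merge (with left_index and right_index) and got the same result.
The indexes are identical (by design); I checked both with index.is_unique (True) and index.get_duplicates() (empty).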
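For reference, a minimal sketch of that index check. It assumes a recent pandas, where `get_duplicates()` is no longer available and `Index.duplicated()` does the same job; the frame, the series, and the helper name `index_report` are made up for illustration.

```python
import pandas as pd

# Made-up data standing in for df1 and series; both share one index.
idx = pd.to_datetime(
    ["2013-01-14 17:04:45", "2013-01-14 17:04:35", "2013-01-14 17:04:25"]
)
df1 = pd.DataFrame({"X": [1.0, 2.0, 3.0]}, index=idx)
series = pd.Series([10, 20, 30], index=idx, name="extra")


def index_report(obj):
    """Return (is_unique, duplicated index labels) for a Series or DataFrame."""
    index = obj.index
    return index.is_unique, list(index[index.duplicated()])


print(index_report(df1))
print(index_report(series))
print(df1.index.equals(series.index))
```

Note that this only inspects the row index; it says nothing about the column labels.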
Basic version:
df1.join(series)
merge(df1, series_as_df,
print tempres.index
[2013-01-14 17:04:45, ..., 2013-01-14 16:53:05] Length: 89, Freq: None, Timezone: None
What is strange is the printed values:
print tempres.index.values
[1970-01-16 121:04:45 1970-01-16 121:04:35 1970-01-16 121:04:25
 1970-01-16 121:04:15 1970-01-16 121:04:05 1970-01-16 121:03:55
 1970-01-16 121:03:45 1970-01-16 121:03:35 1970-01-16 121:03:25
 1970-01-16 121:03:15 1970-01-16 121:03:05 1970-01-16 121:02:55
 1970-01-16 121:02:45 1970-01-16 121:02:35 1970-01-16 121:02:25
 1970-01-16 121:02:15 1970-01-16 121:02:05 1970-01-16 121:01:55
 1970-01-16 121:01:45 1970-01-16 121:01:35 1970-01-16 121:01:25 ...]
If needed, I can add the pickled series and df ...
I am using the latest version of pandas, 0.10.x.
Thanks,
Luc
My code (cut from a larger code base):
# XYTparams: existing DataFrame
# prep_functions[funcname]: list of values, same length as XYTparams
iSeries = Series(prep_functions[funcname], index = XYTparams.index, name = funcname)
XYTparams = XYTparams.join(iSeries)
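A self-contained sketch of the same pattern, assuming a recent pandas and made-up values. The helper `join_named_series` is hypothetical; it only adds a guard so that the join never introduces a column label that already exists in the frame.

```python
import pandas as pd

# Made-up stand-ins for the real objects.
XYTparams = pd.DataFrame({"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0]})
prep_functions = {"VKCSKEY1": [(60744, 0), (60744, 0), (60750, 0)]}


def join_named_series(df, values, name):
    """Join `values` onto `df` as a new column called `name`.

    Raises ValueError if `name` is already a column, so repeated joins
    cannot silently end up with the same label twice.
    """
    if name in df.columns:
        raise ValueError("column %r already exists" % name)
    new_col = pd.Series(values, index=df.index, name=name)
    return df.join(new_col)


funcname = "VKCSKEY1"
XYTparams = join_named_series(XYTparams, prep_functions[funcname], funcname)
print(XYTparams.columns.tolist())
```

Calling the helper a second time with the same `funcname` raises instead of joining again.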
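Revisiting my question:
I use merge and join successively on a base DataFrame. When I attempt the next merge/join, I start getting the error. I could not reproduce this in a simple test, but I saved the dataframes just before the problem starts.
I cannot find what is wrong.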
from pandas import load, merge  # assumed imports; the snippet was cut from a larger script
base_df = load('SPOparams.pic')
lookup_df = load('lookup.pic')
print base_df
print lookup_df
print base_df.count()
print base_df['VKCSKEY1']
print lookup_df['traf_key']
# reset index does not change a thing
base_df = base_df.reset_index(drop=True)
print base_df.index
print base_df.index.get_duplicates()
print lookup_df.index
print lookup_df.index.get_duplicates()
# checking value matches
for k in lookup_df['traf_key']:
    print k, k in base_df['VKCSKEY1'].values
# why is this merge unsuccessful?
# (it fails with any combination of the parameters)
df_result = merge(base_df, lookup_df,
                  how='left',
                  #how = 'outer',
                  left_on ='VKCSKEY1',
                  right_on ='traf_key',
                  #left_index=True,
                  #right_index = True,
                  #sort=True,
                  #suffixes=('', '.m'), copy=True
                  )
print df_result
Output:
1.6.1
0.10.1
<class 'pandas.core.frame.DataFrame'>
Int64Index: 89 entries, 0 to 88
Data columns:
T 89 non-null values
X 89 non-null values
Y 89 non-null values
precip_quantity_1hour 89 non-null values
pressure 89 non-null values
rel_humidity 89 non-null values
temp 89 non-null values
temp_max 0 non-null values
temp_min 0 non-null values
wind_direction 89 non-null values
wind_speed 89 non-null values
BC_TRAF 89 non-null values
closest 89 non-null values
closest.m 89 non-null values
AGGP.P50_ID 89 non-null values
AGGP.FUNC_CLASS 89 non-null values
AGGP.SPEED_CAT 89 non-null values
LINK_ID 89 non-null values
FUNC_CLASS 89 non-null values
SPEED_CAT 89 non-null values
AR_AUTO 89 non-null values
AR_BUS 89 non-null values
AR_TAXIS 89 non-null values
AR_CARPOOL 89 non-null values
AR_PEDEST 89 non-null values
AR_TRUCKS 89 non-null values
STCA20_PCT 89 non-null values
VKC_LINKNR 89 non-null values
TRVIC150R1 89 non-null values
closest.m 89 non-null values
closest.m.m 89 non-null values
VKCP.LINK_ID 89 non-null values
VKCP.FUNC_CLASS 89 non-null values
VKCP.SPEED 89 non-null values
VKCP.LINKNR 89 non-null values
VKCP.TWIN_ID 89 non-null values
VKCSKEY1 89 non-null values
dtypes: datetime64[ns](1), float64(13), int64(9), object(14)
<class 'pandas.core.frame.DataFrame'>
Index: 30 entries, (60744, 0) to (58314, 0)
Data columns:
traf_key 30 non-null values
weekday_nr 30 non-null values
linknr 30 non-null values
weekday 30 non-null values
vr0 30 non-null values
vr1 30 non-null values
vr2 30 non-null values
vr3 30 non-null values
vr4 30 non-null values
vr5 30 non-null values
vr6 30 non-null values
vr7 30 non-null values
vr8 30 non-null values
vr9 30 non-null values
vr10 30 non-null values
vr11 30 non-null values
vr12 30 non-null values
vr13 30 non-null values
vr14 30 non-null values
vr15 30 non-null values
vr16 30 non-null values
vr17 30 non-null values
vr18 30 non-null values
vr19 30 non-null values
vr20 30 non-null values
vr21 30 non-null values
vr22 30 non-null values
vr23 30 non-null values
au0 30 non-null values
au1 30 non-null values
au2 30 non-null values
au3 30 non-null values
au4 30 non-null values
au5 30 non-null values
au6 30 non-null values
au7 30 non-null values
au8 30 non-null values
au9 30 non-null values
au10 30 non-null values
au11 30 non-null values
au12 30 non-null values
au13 30 non-null values
au14 30 non-null values
au15 30 non-null values
au16 30 non-null values
au17 30 non-null values
au18 30 non-null values
au19 30 non-null values
au20 30 non-null values
au21 30 non-null values
au22 30 non-null values
au23 30 non-null values
sn0 30 non-null values
sn1 30 non-null values
sn2 30 non-null values
sn3 30 non-null values
sn4 30 non-null values
sn5 30 non-null values
sn6 30 non-null values
sn7 30 non-null values
sn8 30 non-null values
sn9 30 non-null values
sn10 30 non-null values
sn11 30 non-null values
sn12 30 non-null values
sn13 30 non-null values
sn14 30 non-null values
sn15 30 non-null values
sn16 30 non-null values
sn17 30 non-null values
sn18 30 non-null values
sn19 30 non-null values
sn20 30 non-null values
sn21 30 non-null values
sn22 30 non-null values
sn23 30 non-null values
dtypes: float64(24), int64(50), object(2)
T 89
X 89
Y 89
precip_quantity_1hour 89
pressure 89
rel_humidity 89
temp 89
temp_max 0
temp_min 0
wind_direction 89
wind_speed 89
BC_TRAF 89
closest 89
closest.m 89
AGGP.P50_ID 89
AGGP.FUNC_CLASS 89
AGGP.SPEED_CAT 89
LINK_ID 89
FUNC_CLASS 89
SPEED_CAT 89
AR_AUTO 89
AR_BUS 89
AR_TAXIS 89
AR_CARPOOL 89
AR_PEDEST 89
AR_TRUCKS 89
STCA20_PCT 89
VKC_LINKNR 89
TRVIC150R1 89
closest.m 89
closest.m.m 89
VKCP.LINK_ID 89
VKCP.FUNC_CLASS 89
VKCP.SPEED 89
VKCP.LINKNR 89
VKCP.TWIN_ID 89
VKCSKEY1 89
0 (60744, 0)
1 (60744, 0)
2 (60744, 0)
3 (60750, 0)
4 (60768, 0)
5 (60768, 0)
6 (60758, 0)
7 (60758, 0)
8 (69223, 0)
9 (69223, 0)
10 (69223, 0)
11 (64265, 0)
12 (64265, 0)
13 (64265, 0)
14 (64265, 0)
15 (64265, 0)
16 (64265, 0)
17 (64265, 0)
18 (64265, 0)
19 (64265, 0)
20 (64216, 0)
21 (64216, 0)
22 (64216, 0)
23 (64216, 0)
24 (64216, 0)
25 (64216, 0)
26 (64216, 0)
27 (64216, 0)
28 (64216, 0)
29 (57085, 0)
30 (57085, 0)
31 (57085, 0)
32 (57085, 0)
33 (57085, 0)
34 (57085, 0)
35 (57014, 0)
36 (57033, 0)
37 (57033, 0)
38 (64065, 0)
39 (64065, 0)
40 (64065, 0)
41 (64065, 0)
42 (64065, 0)
43 (57070, 0)
44 (64062, 0)
45 (64062, 0)
46 (64062, 0)
47 (64062, 0)
48 (57070, 0)
49 (64061, 0)
50 (64061, 0)
51 (64061, 0)
52 (64061, 0)
53 (59849, 0)
54 (59415, 0)
55 (58487, 0)
56 (58054, 0)
57 (58054, 0)
58 (58054, 0)
59 (52551, 0)
60 (58054, 0)
61 (58054, 0)
62 (58054, 0)
63 (58054, 0)
64 (52551, 0)
65 (58054, 0)
66 (58488, 0)
67 (58488, 0)
68 (58028, 0)
69 (58464, 0)
70 (58028, 0)
71 (57989, 0)
72 (58595, 0)
73 (58027, 0)
74 (57989, 0)
75 (58595, 0)
76 (58595, 0)
77 (58019, 0)
78 (58595, 0)
79 (58595, 0)
80 (58019, 0)
81 (58595, 0)
82 (58595, 0)
83 (66715, 0)
84 (58595, 0)
85 (59295, 0)
86 (67614, 0)
87 (58314, 0)
88 (58314, 0)
Name: VKCSKEY1, Length: 89
VKCSKEY1
(60744, 0) (60744, 0)
(60750, 0) (60750, 0)
(60768, 0) (60768, 0)
(60758, 0) (60758, 0)
(69223, 0) (69223, 0)
(64265, 0) (64265, 0)
(64216, 0) (64216, 0)
(57085, 0) (57085, 0)
(57014, 0) (57014, 0)
(57033, 0) (57033, 0)
(64065, 0) (64065, 0)
(57070, 0) (57070, 0)
(64062, 0) (64062, 0)
(64061, 0) (64061, 0)
(59849, 0) (59849, 0)
(59415, 0) (59415, 0)
(58487, 0) (58487, 0)
(58054, 0) (58054, 0)
(52551, 0) (52551, 0)
(58488, 0) (58488, 0)
(58028, 0) (58028, 0)
(58464, 0) (58464, 0)
(57989, 0) (57989, 0)
(58595, 0) (58595, 0)
(58027, 0) (58027, 0)
(58019, 0) (58019, 0)
(66715, 0) (66715, 0)
(59295, 0) (59295, 0)
(67614, 0) (67614, 0)
(58314, 0) (58314, 0)
Name: traf_key
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88], dtype=int64)
[]
Index([(60744, 0), (60750, 0), (60768, 0), (60758, 0), (69223, 0), (64265, 0), (64216, 0), (57085, 0), (57014, 0), (57033, 0), (64065, 0), (57070, 0), (64062, 0), (64061, 0), (59849, 0), (59415, 0), (58487, 0), (58054, 0), (52551, 0), (58488, 0), (58028, 0), (58464, 0), (57989, 0), (58595, 0), (58027, 0), (58019, 0), (66715, 0), (59295, 0), (67614, 0), (58314, 0)], dtype=object)
[]
(60744, 0) True
(60750, 0) True
(60768, 0) True
(60758, 0) True
(69223, 0) True
(64265, 0) True
(64216, 0) True
(57085, 0) True
(57014, 0) True
(57033, 0) True
(64065, 0) True
(57070, 0) True
(64062, 0) True
(64061, 0) True
(59849, 0) True
(59415, 0) True
(58487, 0) True
(58054, 0) True
(52551, 0) True
(58488, 0) True
(58028, 0) True
(58464, 0) True
(57989, 0) True
(58595, 0) True
(58027, 0) True
(58019, 0) True
(66715, 0) True
(59295, 0) True
(67614, 0) True
(58314, 0) True
Traceback (most recent call last):
File "L:\temp\pandas_join_bug.py", line 43, in <module>
right_on ='traf_key',
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 36, in merge
return op.get_result()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 185, in get_result
ldata, rdata = self._get_merge_data()
File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 277, in _get_merge_data
copydata=False)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 1194, in _maybe_rename_join
to_rename = self.items.intersection(other.items)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 666, in intersection
indexer = self.get_indexer(other.values)
File "C:\Python27\lib\site-packages\pandas\core\index.py", line 812, in get_indexer
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
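One thing that may be worth checking: the traceback fails inside `self.items.intersection(other.items)`, which works on the column labels rather than the row index, and the column listing of the base frame above shows `closest.m` twice. A minimal sketch of a column-label check, assuming a recent pandas and a made-up miniature of the frame (the helper `duplicated_labels` is hypothetical):

```python
import pandas as pd

# Made-up miniature of base_df: the label 'closest.m' appears twice,
# as it does in the column listing in the output.
base_df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=["closest", "closest.m", "closest.m"],
)


def duplicated_labels(df):
    """Return the column labels that occur more than once."""
    cols = df.columns
    return list(cols[cols.duplicated()].unique())


print(base_df.columns.is_unique)
print(duplicated_labels(base_df))

# One possible cleanup (an assumption, not necessarily the right choice here):
# keep only the first column for each repeated label.
deduped = base_df.loc[:, ~base_df.columns.duplicated()]
print(list(deduped.columns))
```

Whether dropping or renaming the repeated column is appropriate depends on what the two `closest.m` columns actually hold.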
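Once the error occurs, I cannot get any merge or join statement to succeed. At first I did not see that the error was linked to the repeated merge/join operations. With a fresh setup, any single merge/join works; as soon as I need another merge/join, I get the same error. I have been struggling with this for days...
Help!!!
Luc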
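Please post your data and code. – HYRY 2013-03-12 12:59:13
Something strange is happening. If the series data is numeric, the code works; if it is a tuple or a string, it fails... – user1708646 2013-03-13 19:32:51
FYI, your print is just showing how numpy 1.6.2 represents dates; what are you doing? – Jeff 2013-03-14 14:48:27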