You can use a list comprehension and then cast the resulting True/False values to int:
df["exist"] = [r[0] in r[1] for r in zip(df["p"], df["p_list"])]
df["exist"] = df["exist"].astype(int)
print (df)
    p         p_list  exist
0  12     [12, 1, 5]      1
1   4         [3, 1]      0
2   5     [8, 9, 11]      0
3   6      [6, 7, 9]      1
4   7      [7, 1, 2]      1
5   7     [12, 9, 8]      0
6   6     [6, 1, 15]      1
7   5  [6, 8, 9, 11]      0
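The answer does not show how df is built, so here is a minimal self-contained sketch of the same approach. The DataFrame constructor call is my assumption, reconstructed from the printed output above; the comprehension and the cast are the ones from the answer.

```python
import pandas as pd

# Sample data reconstructed from the printed output (an assumption,
# the original post does not show how df was created).
df = pd.DataFrame({
    "p": [12, 4, 5, 6, 7, 7, 6, 5],
    "p_list": [[12, 1, 5], [3, 1], [8, 9, 11], [6, 7, 9],
               [7, 1, 2], [12, 9, 8], [6, 1, 15], [6, 8, 9, 11]],
})

# Pair each value of p with the list in the same row and test membership.
df["exist"] = [r[0] in r[1] for r in zip(df["p"], df["p_list"])]

# Convert the booleans to 1/0.
df["exist"] = df["exist"].astype(int)

print(df)
```

zip walks both columns in parallel, so each r is a (p, p_list) pair taken from a single row.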
Or cast to int directly inside the comprehension:
df["exist"] = [int(r[0] in r[1]) for r in zip(df["p"], df["p_list"])]
print (df)
    p         p_list  exist
0  12     [12, 1, 5]      1
1   4         [3, 1]      0
2   5     [8, 9, 11]      0
3   6      [6, 7, 9]      1
4   7      [7, 1, 2]      1
5   7     [12, 9, 8]      0
6   6     [6, 1, 15]      1
7   5  [6, 8, 9, 11]      0
Timings:
#[8000 rows x 2 columns]
df = pd.concat([df]*1000).reset_index(drop=True)
print (df)
In [89]: %%timeit
...: df["exist2"] = [r[0] in r[1] for r in zip(df["p"], df["p_list"])]
...: df["exist2"] = df["exist2"].astype(int)
...:
100 loops, best of 3: 6.07 ms per loop
In [90]: %%timeit
...: df["exist"] = [1 if r[0] in r[1] else 0 for r in zip(df["p"], df["p_list"])]
...:
100 loops, best of 3: 7.16 ms per loop
In [91]: %%timeit
...: df["exist"] = [int(r[0] in r[1]) for r in zip(df["p"], df["p_list"])]
...:
100 loops, best of 3: 9.23 ms per loop
In [92]: %%timeit
...: df['exist1'] = df.apply(lambda x: x.p in x.p_list, axis=1).astype(int)
...:
1 loop, best of 3: 370 ms per loop
In [93]: %%timeit
...: df["exist"]= df.apply(lambda r: 1 if r["p"] in r["p_list"] else 0, axis=1)
1 loop, best of 3: 310 ms per loop
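If you are not in IPython, a rough equivalent of the comparison above can be sketched with the standard timeit module. The helper names with_comprehension and with_apply and the sample data are my own, for illustration only; absolute numbers will differ by machine and pandas version, so none are claimed here.

```python
import timeit

import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({
    "p": [12, 4, 5, 6, 7, 7, 6, 5],
    "p_list": [[12, 1, 5], [3, 1], [8, 9, 11], [6, 7, 9],
               [7, 1, 2], [12, 9, 8], [6, 1, 15], [6, 8, 9, 11]],
})
# Repeat the 8 rows 1000 times, as in the timings above (8000 rows).
df = pd.concat([df] * 1000).reset_index(drop=True)


def with_comprehension(frame):
    # Hypothetical helper: list comprehension over the zipped columns.
    return [int(p in lst) for p, lst in zip(frame["p"], frame["p_list"])]


def with_apply(frame):
    # Hypothetical helper: row-wise DataFrame.apply.
    return frame.apply(lambda r: int(r["p"] in r["p_list"]), axis=1).tolist()


# Both approaches should give identical results before timing them.
same_result = with_comprehension(df) == with_apply(df)

for fn in (with_comprehension, with_apply):
    seconds = timeit.timeit(lambda: fn(df), number=3)
    print(f"{fn.__name__}: {seconds / 3 * 1000:.2f} ms per call")
```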
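Couldn't 'isin' be used for this? Or 'eval('p in p_list')'? – SethMMorton
@SethMMorton - I don't think so, because the comparison has to be done row by row, and 'eval' returns an error for me (I'm not sure how to use it). – jezrael
Sorry, I meant 'df.eval('p in p_list')'. How does that fail? It should be evaluated row-wise. – SethMMorton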
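To illustrate the row-by-row point from the comments: Series.isin tests every value against one shared collection, whereas this question needs each p checked against the list in its own row. A small sketch, using sample data I reconstructed from the output in the answer and [12, 1, 5] as an arbitrary shared collection:

```python
import pandas as pd

# Sample data reconstructed from the printed output (an assumption).
df = pd.DataFrame({
    "p": [12, 4, 5, 6, 7, 7, 6, 5],
    "p_list": [[12, 1, 5], [3, 1], [8, 9, 11], [6, 7, 9],
               [7, 1, 2], [12, 9, 8], [6, 1, 15], [6, 8, 9, 11]],
})

# isin compares every p against the SAME collection of values.
shared = df["p"].isin([12, 1, 5]).astype(int).tolist()

# The row-wise check pairs each p with the list from its own row.
row_wise = [int(p in lst) for p, lst in zip(df["p"], df["p_list"])]

print(shared)    # membership in the single list [12, 1, 5]
print(row_wise)  # membership in each row's own p_list
```

The two results differ, for example in row 2, where p is 5: it is in the shared list but not in that row's own [8, 9, 11].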