pandas df.loc: keep only the first item

I want to get the value in one column based on the value in another column of the same row.

For example:

Given business ID = '123', I want to retrieve the business name.

DF:

biz_id biz_name 
123  chew 
456  bite 
123  chew

Code:

df['biz_name'].loc[df['biz_id'] == 123]

This returns:

chew 
chew

How do I get just one value, 'chew', as a string?

Source

2017-01-17 jxn

You can use iloc or iat to select the first value of the Series:

print (df.loc[df['biz_id'] == 123, 'biz_name'].iloc[0]) 
chew

Or:

print (df.loc[df['biz_id'] == 123, 'biz_name'].iat[0]) 
chew

With query:

print (df.query('biz_id == 123')['biz_name'].iloc[0]) 
chew

Or convert to a list or a numpy array and select the first value:

print (df.loc[df['biz_id'] == 123, 'biz_name'].tolist()[0]) 
chew 

print (df.loc[df['biz_id'] == 123, 'biz_name'].values[0]) 
chew
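
To make the options above easy to reproduce, here is a minimal sketch that rebuilds the sample frame from the question. The helper name `first_match` and its `default` parameter are hypothetical, introduced only for illustration. It guards against the case where no row matches, in which `.iloc[0]` would raise an IndexError on the empty Series.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"biz_id": [123, 456, 123],
                   "biz_name": ["chew", "bite", "chew"]})


def first_match(frame, biz_id, default=None):
    """Hypothetical helper: return the first biz_name for biz_id, or default."""
    matches = frame.loc[frame["biz_id"] == biz_id, "biz_name"]
    if matches.empty:
        # No matching row, so .iloc[0] would raise IndexError here.
        return default
    return matches.iloc[0]


print(first_match(df, 123))  # chew
print(first_match(df, 999))  # None
```

If a missing ID should be treated as an error in your code, dropping the `empty` check and letting the IndexError propagate is probably the simpler choice.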

Timings:

In [18]: %timeit (df.loc[df['biz_id'] == 123, 'biz_name'].iloc[0]) 
1000 loops, best of 3: 399 µs per loop 

In [19]: %timeit (df.loc[df['biz_id'] == 123, 'biz_name'].iat[0]) 
The slowest run took 4.16 times longer than the fastest. This could mean that an intermediate result is being cached. 
1000 loops, best of 3: 391 µs per loop 

In [20]: %timeit (df.query('biz_id == 123')['biz_name'].iloc[0]) 
The slowest run took 4.39 times longer than the fastest. This could mean that an intermediate result is being cached. 
1000 loops, best of 3: 1.75 ms per loop 

In [21]: %timeit (df.loc[df['biz_id'] == 123, 'biz_name'].tolist()[0]) 
The slowest run took 4.18 times longer than the fastest. This could mean that an intermediate result is being cached. 
1000 loops, best of 3: 384 µs per loop 

In [22]: %timeit (df.loc[df['biz_id'] == 123, 'biz_name'].values[0]) 
The slowest run took 5.32 times longer than the fastest. This could mean that an intermediate result is being cached. 
1000 loops, best of 3: 370 µs per loop 

In [23]: %timeit (df.loc[df.biz_id.eq(123).idxmax(), 'biz_name']) 
1000 loops, best of 3: 517 µs per loop

Source

2017-01-17 06:59:23 jezrael

Use idxmax to get the index of the first maximum value, which for a boolean mask is the first True:

df.loc[df.biz_id.eq(123).idxmax(), 'biz_name'] 

'chew'
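
One caveat worth checking before relying on this: when no row matches, the mask is all False, and idxmax still returns the first index label, so the lookup silently returns the first row's value. The sketch below rebuilds the sample frame and assumes a default integer index; the variable names `wrong` and `safe` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question (default RangeIndex assumed).
df = pd.DataFrame({"biz_id": [123, 456, 123],
                   "biz_name": ["chew", "bite", "chew"]})

mask = df["biz_id"].eq(456)
print(df.loc[mask.idxmax(), "biz_name"])  # bite

# No row has biz_id 999, so the mask is all False and idxmax falls back
# to the first label (0), which yields the first row's name.
missing = df["biz_id"].eq(999)
wrong = df.loc[missing.idxmax(), "biz_name"]
print(wrong)  # chew

# Guarding with .any() avoids the silent wrong answer.
safe = df.loc[missing.idxmax(), "biz_name"] if missing.any() else None
print(safe)  # None
```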

Source

2017-01-17 07:05:29 piRSquared
