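How do I get the index of a row using a ColumnDataSource?

I am using a pandas DataFrame populated from a CSV file, and then using Bokeh to convert that DataFrame into a ColumnDataSource. How do I get the index of a row from the ColumnDataSource?

It looks like this: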

dataFrame = pandas.read_csv('somefile.CSV') 
source = ColumnDataSource(dataFrame)

Now I have all my columns, and I want to do row-based calculations.

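For example, I have three columns:

x, y, colour

They might be populated with:

1, 2, blue
2, 5, red
1, 8, yellow

Now, when I search through the source, I want to change a related value in the matching row. How can I do this: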

# how do i step through the source dictionary? 
if source['colour'] == 'blue': 
    # how do I get the current index, which is the row number 
    # how do I change the x column value at the index(row) we retrieved 
    source['x' index] = 2

Thanks

2017-10-10 cyclops

If you iterate over the data, you can do it like this:

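import pandas
from bokeh.models import ColumnDataSource

dataFrame = pandas.read_csv('somefile.csv')
source = ColumnDataSource(dataFrame)

# Walk the 'colour' column; index is the row number.
for index, colour in enumerate(source.data['colour']):
    if colour == 'blue':
        # Change the x value in the matching row.
        source.data['x'][index] = 2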
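
The loop above can be wrapped in a small helper that also reports which rows it touched. This is only a sketch: set_where is a hypothetical name, not part of Bokeh, and a plain dict of lists stands in for source.data so the example runs without Bokeh or a CSV file.

```python
def set_where(data, match_column, match_value, target_column, new_value):
    """Set target_column to new_value in every row where match_column equals match_value.

    `data` is any mapping of column name -> mutable sequence; it is modified in place.
    Returns the list of row indices that were changed.
    """
    changed = []
    for index, value in enumerate(data[match_column]):
        if value == match_value:
            data[target_column][index] = new_value
            changed.append(index)
    return changed


# Stand-in for source.data, using the sample rows from the question.
data = {
    'x': [1, 2, 1],
    'y': [2, 5, 8],
    'colour': ['blue', 'red', 'yellow'],
}
changed_rows = set_where(data, 'colour', 'blue', 'x', 2)
```

With a real ColumnDataSource, the first argument would presumably be source.data.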

Alternatively, to avoid iterating over the whole ColumnDataSource, you can use this to get the index of the first 'blue' value in the 'colour' column:

list(source.data['colour']).index('blue')

You can then use it as the index for editing column x, like this:

source.data['x'][list(source.data['colour']).index('blue')] = 2

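Indexing the list this way only gives you the first index at which the value is 'blue'. If 'blue' occurs more than once in your ColumnDataSource and the associated 'x' value should be edited for each occurrence, you should be able to iterate through the 'colour' column by starting the search just after the last index at which 'blue' was found: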

list(source.data['colour'])[last_index+1:].index('blue')

The loop this runs in should be wrapped in a try statement, because index('blue') raises a ValueError when the list it is searching does not contain the value 'blue'.
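
One detail to watch with the sliced search: the position it returns is relative to the start of the slice, so last_index + 1 has to be added back to get the real row number. A minimal sketch of the whole loop follows, using the optional start argument of list.index, which returns absolute positions directly. The name all_indices and the extra fourth row are made up for illustration, and a plain dict stands in for source.data.

```python
def all_indices(column, value):
    """Return every index at which `value` occurs in `column`, using list.index."""
    values = list(column)
    indices = []
    start = 0
    while True:
        try:
            # Search from `start` onwards; the result is an absolute index.
            found = values.index(value, start)
        except ValueError:
            # No further occurrence: stop searching.
            break
        indices.append(found)
        start = found + 1
    return indices


# Stand-in for source.data, with a second 'blue' row added for illustration.
data = {
    'x': [1, 2, 1, 7],
    'colour': ['blue', 'red', 'yellow', 'blue'],
}
blue_rows = all_indices(data['colour'], 'blue')
for row in blue_rows:
    data['x'][row] = 2
```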

2017-10-10 14:58:04

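Thanks, I found that I can also use dataFrame.iloc[index] to return the row! If anyone knows a faster way to do this, please let me know, because both the method above and mine are very slow on large datasets. – cyclops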

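If you only need to make the changes up front (before the data goes into the ColumnDataSource), then making them in the pandas DataFrame will almost certainly be fastest. You can also do it like this: df['x'].loc[(df['color'] == 'blue')] = 2. Even on large datasets this should still be very fast. –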
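
A sketch of that vectorised approach, written as a single .loc call on the frame rather than chained indexing, since chained assignment can trigger pandas' SettingWithCopyWarning. The DataFrame is built inline from the question's sample rows instead of being read from a CSV, and the column name follows the question's spelling, colour.

```python
import pandas as pd

# Sample rows from the question, built inline instead of read from a CSV.
df = pd.DataFrame({
    'x': [1, 2, 1],
    'y': [2, 5, 8],
    'colour': ['blue', 'red', 'yellow'],
})

# Boolean mask: True for every row whose colour is 'blue'.
mask = df['colour'] == 'blue'

# Set x in all matching rows in one step, without a Python-level loop.
df.loc[mask, 'x'] = 2

# Row labels of the matching rows, in case the indices themselves are needed.
blue_index = df.index[mask].tolist()
```

The modified frame can then be passed to ColumnDataSource as in the question.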

Use

source.x[source.color == 'blue'] = 2

source.x is the series you want to change; the condition in the brackets selects only the rows for which it is true.
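
As far as I know, a ColumnDataSource does not expose its columns as attributes; they live in the dict-like source.data. A minimal sketch of the same boolean-mask idea in that form, assuming the columns are NumPy arrays (which is what you should get when the source is built from a DataFrame). A plain dict stands in for source.data so the example runs without Bokeh.

```python
import numpy as np

# Stand-in for source.data: column name -> NumPy array.
data = {
    'x': np.array([1, 2, 1]),
    'y': np.array([2, 5, 8]),
    'colour': np.array(['blue', 'red', 'yellow']),
}

# Elementwise comparison gives a boolean array, one entry per row.
mask = data['colour'] == 'blue'

# Assign to all matching rows of x at once.
data['x'][mask] = 2

# Integer row numbers of the matches, if the indices are needed.
matching_rows = np.flatnonzero(mask)
```

Note that editing an array in place may not make Bokeh redraw a plot that is already displayed; assigning a new array or dict to source.data is the usual way to signal a change.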

2017-10-10 16:27:36 MarianD
