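Pandas way to find discontinuous data

I want to find out which columns of a pandas DataFrame contain discontinuous data. By "discontinuous" I mean that the values drop from a nonzero value back to zero and then become nonzero again. For example: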
[0,0,0,1,2,3,4,5,0,0,0] # continuous
[0,0,0,1,2,0,4,5,0,0,0] # not continuous
I have managed to write some code that does this by looping over each column of the DataFrame. Here is a working snippet that shows what I mean:
import numpy as np
import pandas as pd

def find_discontinuous(series):
    # switch: 0 = still in leading zeros, 1 = inside the nonzero run,
    #         2 = back at zero after the nonzero run
    switch = 0
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    for index, val in series.items():
        # print(val, end=" ")
        if switch == 0 and val == 0:
            # print("still zero")
            continue
        elif switch == 0 and val != 0:
            switch = 1
        if switch == 1 and val == 0:
            # print("back to zero")
            switch = 2
            continue
        if switch == 2 and val != 0:
            # print("supposed to be zero")
            return "not continuous"
    return "continuous"

data = np.array([[0,1,2,3,4,5,0],
                 [0,1,2,0,4,5,0]])
# Transpose so that each test sequence becomes one column
df = pd.DataFrame(data, columns=list(range(7)), index=list(range(2))).transpose()
for column in df.columns:
    series = df.loc[:, column]
    res = find_discontinuous(series)
    print(column, res)
Output:
0 continuous
1 not continuous
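I read somewhere that using a for loop to go through a pandas DataFrame is probably not the right approach, because iterating over it is slow. What is the pandas way to achieve the same thing?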
So anything that is not discontinuous is considered continuous? For example, would a column of all zeros count as continuous? – Divakar
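
One possible vectorized approach, offered as a sketch rather than a definitive answer: count how many separate runs of nonzero values each column contains, and call a column continuous when it has at most one run. The helper names `count_nonzero_runs` and `classify_columns` are made up for this illustration. The sketch assumes the DataFrame has at least one row, and it treats an all-zero column as continuous, which matches what the loop in the question returns.

```python
import numpy as np
import pandas as pd

def count_nonzero_runs(df):
    """Hypothetical helper: number of contiguous nonzero runs per column."""
    nonzero = df.ne(0)
    # A run starts wherever the mask goes from False (0) to True (1)
    starts_inside = nonzero.astype(int).diff().eq(1).sum()
    # diff() cannot see a run that starts on the very first row, so add it
    starts_at_top = nonzero.iloc[0].astype(int)
    return starts_inside + starts_at_top

def classify_columns(df):
    """Hypothetical helper: label each column like the loop version does."""
    runs = count_nonzero_runs(df)
    return runs.map(lambda n: "continuous" if n <= 1 else "not continuous")

data = np.array([[0, 1, 2, 3, 4, 5, 0],
                 [0, 1, 2, 0, 4, 5, 0]])
df = pd.DataFrame(data).transpose()
print(classify_columns(df).to_dict())
# {0: 'continuous', 1: 'not continuous'}
```

This avoids the Python-level loop over rows; whether it is actually faster for a given DataFrame would need to be measured.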