Drop pandas DataFrame rows that contain string representations of integers

I have a pandas DataFrame with the columns

[Brand, CPL1, CPL4, Part Number, Calendar Year/Month, value, type]

When the values come out of StatsModels X13, they occasionally include the string representation of a very large integer, which makes no sense in this context, e.g.:

[float(1.2), float(1.3), str("63478"), float(1.1)]

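How do I drop the rows where this happens? Since the values are string representations of integers, I can't just cast them or use any similar approach.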

Source

2016-11-04 Jeremy Barnes

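What is the source of the data? Where do the faulty columns (or the faulty rows within a column) come from? Some specific sample data and/or code would help.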

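The source is an SAP Hana xls file imported into a DataFrame. Each part number is flattened into a Series and run through statsmodels x13. The Series that comes out of x13 contains these offending rows.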

You can use boolean indexing to check whether the type of each value is str:

DataFrame:

import pandas as pd

df = pd.DataFrame([[float(1.2), float(1.3), str("63478"), float(1.1)],
                   [float(1.2), float(1.3), float(1.1), str("63478")]]).T

print(df)
       0      1
0    1.2    1.2
1    1.3    1.3
2  63478    1.1
3    1.1  63478

# True wherever a cell holds a str
print(df.applymap(lambda x: isinstance(x, str)))
       0      1
0  False  False
1  False  False
2   True  False
3  False   True

# True for every row that has at least one str cell
print(df.applymap(lambda x: isinstance(x, str)).any(axis=1))
0    False
1    False
2     True
3     True
dtype: bool

# invert the mask to keep only the rows without str cells
print(df[~df.applymap(lambda x: isinstance(x, str)).any(axis=1)])
     0    1
0  1.2  1.2
1  1.3  1.3
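
The same mask can also be built column by column with Series.map, which avoids applymap (deprecated since pandas 2.1 in favor of DataFrame.map). This is only a sketch: the helper name drop_rows_with_strings is made up for illustration and is not part of pandas.

```python
import pandas as pd


def drop_rows_with_strings(df: pd.DataFrame) -> pd.DataFrame:
    """Return df without the rows that hold at least one str value."""
    # Hypothetical helper: build the boolean mask one column at a time.
    is_str = df.apply(lambda col: col.map(lambda x: isinstance(x, str)))
    # Keep the rows where no column contains a str.
    return df[~is_str.any(axis=1)]


df = pd.DataFrame([[1.2, 1.3, "63478", 1.1],
                   [1.2, 1.3, 1.1, "63478"]]).T
cleaned = drop_rows_with_strings(df)
print(cleaned)
```

On the sample frame this should keep only rows 0 and 1, the same result as the applymap version.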

Series:

s = pd.Series([float(1.2), float(1.3), str("63478"), float(1.1)])
print(s)
0      1.2
1      1.3
2    63478
3      1.1
dtype: object

print(s.apply(lambda x: isinstance(x, str)))
0    False
1    False
2     True
3    False
dtype: bool

print(s[~s.apply(lambda x: isinstance(x, str))])
0    1.2
1    1.3
3    1.1
dtype: object
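
Note that the filtered Series still has dtype object. If numeric operations are needed afterwards, one possible follow-up is to cast the remaining values to float. This is a minimal sketch that assumes every value left after filtering is a plain float; the variable names kept and clean are illustrative only.

```python
import pandas as pd

s = pd.Series([1.2, 1.3, "63478", 1.1])

# Keep only the non-string entries; the result is still dtype object.
kept = s[~s.map(lambda x: isinstance(x, str))]

# Assumption: everything left is a float, so the cast cannot fail.
clean = kept.astype(float)
print(clean.dtype)
```

The original index labels (0, 1, 3) are preserved; call reset_index(drop=True) if a contiguous index is preferred.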

Source

2016-11-04 14:34:47 jezrael
