2014-03-04 68 views

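Exclude pandas rows based on string value

I have a pandas DataFrame that contains a column with a string data type. I need to exclude from the DataFrame every row that has "Not found" as the string in that column. I am currently doing: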

df[df.some_column != "Not found"], but it does not work.

Looking forward to your replies.

Sample data:

card_number effective_date expiry_date grouping_name  Ac. Year code 
0  1206090 28 Sep 2012 21 Aug 2013 Dummy no.1 201213 
1  1206090 21 Feb 2013 21 Aug 2013 Dummy no.2 201213 
2  1206090 28 Sep 2012 30 Nov 2012 Dummy no.3 201213 
3  1206090 03 Dec 2012 21 Aug 2013 Dummy no.3 201213 
4  1206090 23 Apr 2013 31 Aug 2013 Dummy no.4 201213 
5  1206090 28 Sep 2012 21 Aug 2013 Dummy no.5 201213 
6  1206090 28 Sep 2012 21 Aug 2013 Dummy no.6 201213 
7  1206090 24 Oct 2012 07 Aug 2013  Not found 201213 
8  1206090 08 Jan 2013 08 Jan 2013  Not found 201213 
9  1206090 08 Jan 2013 31 Aug 2013  Not found 201213 
10 Not found 03 Jul 2013 21 Aug 2013 Dummy no.1 201213 
11 Not found 03 Jul 2013 21 Aug 2013 Dummy no.2 201213 

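Additional note: my string matching must be behaving very strangely... when I run df['grouping_name'] == "Not found", I get True for rows 7, 8 and 9... Does anyone know why?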

You need to use `str.contains`; see [here](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.strings.StringMethods.contains.html#pandas.core.strings.StringMethods.contains); for example `df.some_column.str.contains('Not found', na=False, regex=False)`

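TypeError: contains() got an unexpected keyword argument 'regex'... Besides, that would only find the rows whose column has that value... I want the rows whose column does not have it.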

Add `~` at the start and drop `regex=False`; I think that argument was added in `13.1`.
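The suggestion in the comments above can be sketched as follows. This is only an illustration under stated assumptions: the small DataFrame is made up to mirror two columns of the sample data, and the `regex` argument is left out because the asker's pandas version rejected it.

```python
import pandas as pd

# Hypothetical miniature of the sample data (values chosen for illustration).
df = pd.DataFrame({
    "card_number": ["1206090", "1206090", "Not found"],
    "grouping_name": ["Dummy no.1", "Not found", "Dummy no.2"],
})

# str.contains builds a boolean mask; na=False treats missing values as non-matches.
mask = df["grouping_name"].str.contains("Not found", na=False)

# ~ inverts the mask, so only rows that do NOT contain the substring are kept.
filtered = df[~mask]
print(filtered)
```

Note that `str.contains` matches substrings, so it also drops values such as "Not found yet"; the `!=` comparison only drops exact matches.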

Answers

1

Try:

df[df['some_column'] != "Not found"] 

Solution using the sample data:

df = pd.read_csv("data.csv") 
df 

    card_number effective_date expiry_date grouping_name Ac. Year code 
0 1206090  28 Sep 2012  21 Aug 2013  Dummy no.1 201213 
1 1206090  21 Feb 2013  21 Aug 2013  Dummy no.2 201213 
2 1206090  28 Sep 2012  30 Nov 2012  Dummy no.3 201213 
3 1206090  03 Dec 2012  21 Aug 2013  Dummy no.3 201213 
4 1206090  23 Apr 2013  31 Aug 2013  Dummy no.4 201213 
5 1206090  28 Sep 2012  21 Aug 2013  Dummy no.5 201213 
6 1206090  28 Sep 2012  21 Aug 2013  Dummy no.6 201213 
7 1206090  24 Oct 2012  07 Aug 2013  Not found 201213 
8 1206090  08 Jan 2013  08 Jan 2013  Not found 201213 
9 1206090  08 Jan 2013  31 Aug 2013  Not found 201213 
10 Not found 03 Jul 2013  21 Aug 2013  Dummy no.1 201213 
11 Not found 03 Jul 2013  21 Aug 2013  Dummy no.2 201213 


df[df['grouping_name'] != 'Not found'] 

card_number effective_date expiry_date grouping_name Ac. Year code 
0 1206090  28 Sep 2012  21 Aug 2013  Dummy no.1 201213 
1 1206090  21 Feb 2013  21 Aug 2013  Dummy no.2 201213 
2 1206090  28 Sep 2012  30 Nov 2012  Dummy no.3 201213 
3 1206090  03 Dec 2012  21 Aug 2013  Dummy no.3 201213 
4 1206090  23 Apr 2013  31 Aug 2013  Dummy no.4 201213 
5 1206090  28 Sep 2012  21 Aug 2013  Dummy no.5 201213 
6 1206090  28 Sep 2012  21 Aug 2013  Dummy no.6 201213 
10 Not found 03 Jul 2013  21 Aug 2013  Dummy no.1 201213 
11 Not found 03 Jul 2013  21 Aug 2013  Dummy no.2 201213 
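For readers without `data.csv`, the same filter can be reproduced with an inline DataFrame. This is a minimal sketch: only two of the sample columns are rebuilt by hand, and the second filter (checking several columns at once) is an assumption about what "exclude any row with Not found" might mean, not something the answer states.

```python
import pandas as pd

# Two columns of the sample data, typed in by hand (rows 0-11).
df = pd.DataFrame({
    "card_number": ["1206090"] * 10 + ["Not found"] * 2,
    "grouping_name": [
        "Dummy no.1", "Dummy no.2", "Dummy no.3", "Dummy no.3",
        "Dummy no.4", "Dummy no.5", "Dummy no.6",
        "Not found", "Not found", "Not found",
        "Dummy no.1", "Dummy no.2",
    ],
})

# Same filter as the answer: keep rows whose grouping_name is not exactly "Not found".
kept = df[df["grouping_name"] != "Not found"]

# Assumed variant: keep rows where none of the listed columns is "Not found".
cols = ["card_number", "grouping_name"]
kept_all = df[(df[cols] != "Not found").all(axis=1)]

print(kept.index.tolist())
print(kept_all.index.tolist())
```

The first result still contains rows 10 and 11, because their "Not found" sits in `card_number`, which matches the output shown above.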

No luck.

Can you provide sample data? – Amit

There you go... grouping_name must not contain "Not found".
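If the exact comparison still has no effect, one possible cause (not confirmed anywhere in this thread) is stray whitespace around the values, for example from a fixed-width or padded CSV export. The sketch below assumes that situation; the padded strings are invented for illustration.

```python
import pandas as pd

# Hypothetical column where "Not found" carries leading or trailing spaces.
df = pd.DataFrame({
    "grouping_name": ["Dummy no.1", " Not found", "Not found ", "Dummy no.2"],
})

# The exact comparison drops nothing, because no cell equals "Not found" exactly.
exact = df[df["grouping_name"] != "Not found"]

# Strip surrounding whitespace before comparing, without modifying df itself.
cleaned = df["grouping_name"].str.strip()
filtered = df[cleaned != "Not found"]

print(len(exact))
print(filtered.index.tolist())
```

Printing `df["grouping_name"].unique()` is a quick way to check whether such padding is actually present before applying this.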