2016-10-09 89 views
-1

Get specific rows in Python

I have two CSV files.

The first looks like this:

"CONS_NO","DATA_DATE","KWH_READING","KWH_READING1","KWH" 
"1652714033","2015/1/12","4747.3800","4736.8000","10.5800" 
"3332440062","2015/1/12","408.6800","407.8200","0.8600" 
"7804314033","2015/1/12","1794.3500","1792.5000","1.8500" 
"0114314033","2015/1/12","3525.2000","3519.4400","5.7600" 
"1742440062","2015/1/12","3097.1900","3091.4100","5.7800" 
"8230100023","2015/1/12","1035.0500","1026.8400","8.2100" 

It has about six million rows in total.

The other looks like this:

6360609057 
8771218657 
1338004100 
2500009393 
9184968250 
9710581700 
8833903141 

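It has about 10 thousand rows in total.

The second CSV file contains only CONS_NO values. I want to find the rows in the first CSV file whose CONS_NO matches a number in the second CSV file, and delete all the other rows from the first CSV file, using Python.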

+2

What have you done so far?

+0

pandas supports [joining two DataFrames](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging). Try to solve the problem yourself, and if you get stuck, edit the question to include some code.

+0

Thanks, let me try.

Answer

0

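You can use the merge method in pandas to join the two DataFrames on CONS_NO.

I changed your sample data to the following, so that some rows actually match:

test1.csv is: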

"CONS_NO","DATA_DATE","KWH_READING","KWH_READING1","KWH" 
"1652714033","2015/1/12","4747.3800","4736.8000","10.5800" 
"3332440062","2015/1/12","408.6800","407.8200","0.8600" 
"7804314033","2015/1/12","1794.3500","1792.5000","1.8500" 
"8833903141","2015/1/12","3525.2000","3519.4400","5.7600" 
"1742440062","2015/1/12","3097.1900","3091.4100","5.7800" 
"8833903141","2015/1/12","1035.0500","1026.8400","8.2100" 

test2.csv is:

6360609057 
8771218657 
1338004100 
2500009393 
9184968250 
9710581700 
8833903141 

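You can now merge them with the following code:

import pandas as pd

# test1.csv has a header row; test2.csv has none, so name its only column.
df1 = pd.read_csv('test1.csv')
df2 = pd.read_csv('test2.csv', names=['CONS_NO'])

# Inner join: keep only the rows of df1 whose CONS_NO also appears in df2.
result = pd.merge(df1, df2, on='CONS_NO')
print(result)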

It gives the following output:

      CONS_NO  DATA_DATE  KWH_READING  KWH_READING1   KWH
0  8833903141  2015/1/12      3525.20       3519.44  5.76
1  8833903141  2015/1/12      1035.05       1026.84  8.21
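
If the goal is to write the matching rows back to disk, as the question asks, a rough sketch along these lines might work. It swaps merge for isin-based filtering, reads CONS_NO as text so that an ID such as 0114314033 keeps its leading zero, and processes the large file in chunks so the six million rows never have to sit in memory at once. The function name filter_rows, the output path and the chunk size are assumptions for illustration, not part of the answer above.

```python
import pandas as pd


def filter_rows(data_path, ids_path, out_path, chunksize=100_000):
    """Keep only the rows of data_path whose CONS_NO appears in ids_path.

    filter_rows is a hypothetical helper; the paths and chunksize are
    illustrative assumptions.
    """
    # Read the IDs as strings so leading zeros survive, and strip any
    # stray whitespace around them.
    ids = pd.read_csv(ids_path, names=["CONS_NO"], dtype=str)
    wanted = set(ids["CONS_NO"].str.strip())

    first = True
    # Read the large file piece by piece instead of loading it all at once.
    for chunk in pd.read_csv(data_path, dtype={"CONS_NO": str}, chunksize=chunksize):
        kept = chunk[chunk["CONS_NO"].isin(wanted)]
        # Write the header with the first chunk, then append the rest.
        kept.to_csv(out_path, mode="w" if first else "a", header=first, index=False)
        first = False
```

It could be called as filter_rows('test1.csv', 'test2.csv', 'filtered.csv'), where filtered.csv is a hypothetical output file name. Writing to a new file rather than overwriting test1.csv keeps the original data intact if something goes wrong.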