2017-10-05 56 views
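How to check whether a date falls between pairs of dates in Python

I have two CSV files, file 1 and file 2, which contain different information. The second column of both CSV files contains a date. I want to determine whether any date in file 2 falls between a date-time pair from file 1, by which I mean between two consecutive dates from file 1. I also have an additional constraint: the value of the field in column 4 of file 1 must be non-zero.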

import numpy as np
import csv
from datetime import datetime, date, timedelta


def try_parsing_date(text):
    # Try each known format in turn and return the first successful parse
    for fmt in ('%Y-%m-%d %H:%M', '%Y-%m-%d %H:%M:%S', '%d/%m/%Y %H:%M:%S', '%d/%m/%Y %H:%M', '%d/%m/%Y', '%H:%M:%S', '%Y-%m-%d-%H%M%S.%f'):
        try:
            date_strip = datetime.strptime(text, fmt)
            return date_strip
        except ValueError:
            pass
    raise ValueError('no valid date format found')

def append_dates(a, b):
    date_1_vec = []
    date_2_vec = []
    with open(a) as file1:
        reader1 = csv.reader(file1, delimiter=',')
        for row in reader1:
            date_1_vec.append(datetime.strptime(row[1], "%Y-%m-%d-%H%M%S"))
    with open(b) as file2:
        feed_bin = []
        upd_vec = []
        nothing = [0]
        reader2 = csv.reader(file2, delimiter=',')

        for row in reader2:
            temp_date = datetime.strptime(row[1], "%Y-%m-%d %H:%M:%S")
            temp_date2 = temp_date + timedelta(minutes=15)
            test_val = float(row[4])
            if any( (temp_date < dat for dat in date_1_vec) and (temp_date2 > dat for dat in date_1_vec) and (test_val > nothing for nothing in nothing) ):
                feed_bin.append(1)
                val = 1
                #print("yes")
            else:
                feed_bin.append(0)
                val = 0
                #print("No")
            upd = [row[0], row[1], row[2], val]
            upd_vec.append(upd)
    np.savetxt("outfile.csv", upd_vec, delimiter=",", fmt='%s')

def main():
    append_dates("file1.csv", "file2.csv")
main()

I have tried a few different approaches.

File 1

42 08/06/2017 00:00 1 15 0 
42 08/06/2017 00:15 5 11 75 
42 08/06/2017 00:30 0 15 0 
42 08/06/2017 00:45 85 475 0 
42 08/06/2017 01:00 125 75 0 
42 08/06/2017 01:15 0 0 0 
42 08/06/2017 01:30 95 475 0 
42 08/06/2017 01:45 0 75 2.625 
42 08/06/2017 02:00 0 15 0 
42 08/06/2017 02:15 0 13.5 1.5 
42 08/06/2017 02:30 0 1.29623 3.15814 
42 08/06/2017 02:45 0 7.5 15 
42 08/06/2017 03:00 0 0 15 

File 2

42 2017-06-07-232240 
42 2017-06-08-012636 
42 2017-06-08-013811 
42 2017-06-08-014553 
42 2017-06-08-014751 
42 2017-06-08-101332 
42 2017-06-08-101558 
42 2017-06-08-102707 
42 2017-06-08-104039 
42 2017-06-08-105516 
42 2017-06-08-110620 

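My latest attempt is shown above, but so far I have had no success. The problem with my current approach is (I think) that the condition is always satisfied, because it searches all the dates in file 1 rather than consecutive dates, as I require.

Any suggestions on how to modify my code, or on an entirely new approach, would be greatly appreciated!

Update after Jurgy's suggestion - current output: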

2017-06-14 13:51:57 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:57:34 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:51:57 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:57:34 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:51:57 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:57:34 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:51:57 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 13:57:34 is between 2017-06-14 13:45:00 and 2017-06-14 14:00:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 
2017-06-14 16:42:03 is between 2017-06-14 16:30:00 and 2017-06-14 16:45:00 

Answer

1

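How about iterating over the rows of file 1 and, for each row, iterating through the rows of file 2 to see whether one of those dates lies between the last two rows of file 1? This can be optimized by first extracting all the dates from file 2, so you don't have to open the file every time. If the dates from the first file are not always in consecutive order, you can also first check whether prev_day < cur_day before opening file 2.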

import csv
from datetime import datetime

def append_dates(a, b):
    cur_day, prev_day = None, None
    with open(a) as file1:
        for f1row in csv.reader(file1, delimiter=','):
            cur_day = datetime.strptime(f1row[1], "%Y-%m-%d-%H%M%S")
            if prev_day is None:
                # First row: there is no previous date to pair it with yet
                prev_day = cur_day
                continue
            # Re-read file 2 for every consecutive pair of file 1 dates
            with open(b) as file2:
                for f2row in csv.reader(file2, delimiter=','):
                    f2day = datetime.strptime(f2row[1], "%Y-%m-%d %H:%M:%S")
                    if prev_day <= f2day <= cur_day:
                        print("{} is between {} and {}".format(f2day, prev_day, cur_day))
            prev_day = cur_day
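
The optimization mentioned above (reading the dates of file 2 only once) could look roughly like the sketch below. The helper names read_dates and dates_between_pairs are hypothetical and not part of the original code; the column index and the format strings are passed in by the caller, and each file is assumed to be comma-separated with one date per row in the chosen column.

```python
import csv
from datetime import datetime


def read_dates(path, column, fmt):
    """Parse one column of a CSV file into a list of datetimes (hypothetical helper)."""
    with open(path, newline="") as handle:
        return [datetime.strptime(row[column], fmt) for row in csv.reader(handle)]


def dates_between_pairs(boundaries, candidates):
    """Yield (candidate, start, end) for every candidate inside a consecutive pair.

    boundaries: dates whose consecutive pairs define the intervals.
    candidates: dates to test, already loaded into memory once.
    """
    for start, end in zip(boundaries, boundaries[1:]):
        if start >= end:
            continue  # skip pairs that are not in increasing order
        for candidate in candidates:
            if start <= candidate <= end:
                yield candidate, start, end
```

Both files are then opened exactly once, for example by building the two lists with read_dates and looping over dates_between_pairs(boundaries, candidates) to print each match. Because the comparison is inclusive at both ends, a candidate that lands exactly on a shared boundary would be reported for both neighbouring pairs.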
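Thanks! It almost does what I want, but it is currently printing multiple times for the correct condition. I am trying to figure out why that is – Sjoseph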

I also added the line with open(b) as file2: after the continue statement – Sjoseph

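Oh yes, I forgot the open(b). If there are multiple f2 rows between two dates in f1, this solution will print multiple times per f1 row. If you only want the first one, you can add a break after the print. – Jurgy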
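
As an alternative to adding a break, each interval can be flagged at most once by looking up every event date with a binary search. The sketch below is only one possible approach: the function flag_intervals and its parameters are hypothetical, the interval starts are assumed to be sorted, and the non-zero test is assumed to apply to the value of the row that opens each 15-minute interval, as the question's code does with row[4].

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def flag_intervals(starts, values, events, width=timedelta(minutes=15)):
    """Return one 0/1 flag per interval.

    An interval [starts[i], starts[i] + width) gets flag 1 when values[i]
    is non-zero and at least one event falls inside it.
    Assumes starts is sorted in ascending order.
    """
    flags = [0] * len(starts)
    for event in events:
        # Index of the last interval that starts at or before the event
        i = bisect_right(starts, event) - 1
        if i >= 0 and event < starts[i] + width and values[i] != 0:
            flags[i] = 1
    return flags


# Illustrative values only, loosely modelled on the sample rows in the question
starts = [datetime(2017, 6, 8, 1, 15), datetime(2017, 6, 8, 1, 30), datetime(2017, 6, 8, 1, 45)]
values = [0.0, 0.0, 2.625]
events = [datetime(2017, 6, 8, 1, 26, 36), datetime(2017, 6, 8, 1, 45, 53), datetime(2017, 6, 8, 10, 13, 32)]
flags = flag_intervals(starts, values, events)
```

Each event costs one binary search instead of a scan over all intervals, and since a flag is simply set to 1, several events in the same interval do not produce repeated output.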