2017-10-09 87 views

Filtering dates by month in datetime objects

I have a dictionary whose keys are datetime.datetime objects and whose values are lists of tweets. It looks like this:

{datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is some tweet text'], 
datetime.datetime(2017, 9, 30, 19, 55, 20) : ['this is another tweet']... 

I am trying to get the number of tweets sent in each month of the year. So far I have...

startDate = 10 
endDate= 11 
start = True 
while start: 

    for k,v in tweetDict.items(): 
     endDate-=1 
     startDate-=1 

     datetimeStart = datetime(2017, startDate, 1) 
     datetimeEnd = datetime(2017,endDate, 1) 

     print(datetimeStart, datetimeEnd) 

     if datetimeStart < k < datetimeEnd: 
      print(v) 
     if endDate == 2: 
      start = False 
      break 

This only prints (I know about the print statement)...

2017-08-01 00:00:00 2017-09-01 00:00:00 
2017-07-01 00:00:00 2017-08-01 00:00:00 
2017-06-01 00:00:00 2017-07-01 00:00:00 
2017-05-01 00:00:00 2017-06-01 00:00:00 
2017-04-01 00:00:00 2017-05-01 00:00:00 
2017-03-01 00:00:00 2017-04-01 00:00:00 
2017-02-01 00:00:00 2017-03-01 00:00:00 
2017-01-01 00:00:00 2017-02-01 00:00:00 

rather than the actual tweets themselves. I was expecting something like...

2017-08-01 00:00:00 2017-09-01 00:00:00 
['heres a tweet'] 
['theres a tweet'] 
2017-07-01 00:00:00 2017-08-01 00:00:00 
['there only 1 tweet for this month'].... 

I'm a bit stuck. How can I do this?

Answers

1

You can simply group by month instead of trying to decrement and compare different months:

>>> import datetime 
>>> d = {datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'], 
     datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'], 
     datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet'],} 
>>> from itertools import groupby 
>>> for month, group in groupby(d.items(), lambda (k, v): k.month): 
...  print(month) 
...  for dt, tweet in group: 
...   print(dt, tweet) 
...   
10 
2017-10-30 19:55:20 ['this is an october tweet'] 
9 
2017-09-30 19:55:20 ['this is some tweet text'] 
2017-09-30 20:55:20 ['this is another tweet'] 
>>> 

Of course, you can print this in a nicer format, and so on (the inner join is needed because each value appears to be a list):

>>> for month, group in groupby(d.items(), lambda (k, v): k.month): 
...  tweets = list(group) 
...  print("%d tweet(s) in month %d" % (len(tweets), month)) 
...  print('\n'.join(','.join(tweet) for (dt, tweet) in tweets)) 
...  
1 tweet(s) in month 10 
this is an october tweet 
2 tweet(s) in month 9 
this is some tweet text 
this is another tweet 
>>> 

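I can see that groupby would be easier in this example, but I still get a 'SyntaxError' on the first line of the for loop, under '(k,v)'. I am using Python 3. Does that make a difference, since your code looks like Python 2? – e1v1s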

Ah yes, apologies, @e1v1s: change all 'print x' to 'print(x)' (I don't have Python 3 installed on this machine). – Bahrom

Yes, I had already added the parentheses to the print statements. The 'SyntaxError' is the one mentioned in my comment above :) – e1v1s
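
The SyntaxError raised in the comments comes from the `lambda (k, v): ...` form: tuple unpacking in a parameter list was removed in Python 3. A minimal Python 3 sketch of the same grouping follows, assuming the sample dictionary from the answer above. Sorting first is an addition here, since `groupby` only merges adjacent items with the same key, and `tweets_by_month` is a name made up for illustration.

```python
import datetime
from itertools import groupby

d = {
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime.datetime(2017, 9, 30, 20, 55, 20): ['this is another tweet'],
    datetime.datetime(2017, 10, 30, 19, 55, 20): ['this is an october tweet'],
}

# Sort by timestamp so that all items of one month are adjacent;
# groupby only groups consecutive items sharing the same key.
items = sorted(d.items())

# Python 3 lambdas cannot unpack a tuple in the parameter list,
# so index into the (key, value) pair instead.
tweets_by_month = {}
for month, group in groupby(items, key=lambda item: item[0].month):
    tweets_by_month[month] = [text for _, tweets in group for text in tweets]

for month, texts in tweets_by_month.items():
    print("%d tweet(s) in month %d" % (len(texts), month))
    print('\n'.join(texts))
```

Grouping on `k.month` alone lumps together the same month from different years; if the data spans more than one year, a key such as `(item[0].year, item[0].month)` is probably what you want.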

0

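First things first: you put two items with exactly the same key in your dictionary. The second will overwrite the first. For the rest of this answer, I will assume that the second item in the example is slightly different (seconds=21).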
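
A quick sketch of that overwriting behaviour, using the two entries from the question; the name `tweet_dict` is only for illustration:

```python
import datetime

# Two entries with the same key: the later one replaces the earlier one.
tweet_dict = {
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime.datetime(2017, 9, 30, 19, 55, 20): ['this is another tweet'],
}

print(len(tweet_dict))            # 1
print(list(tweet_dict.values()))  # [['this is another tweet']]
```

If several tweets can share a timestamp, appending to the list stored under that key (for example with `tweet_dict.setdefault(key, []).append(text)`) keeps all of them.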

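The reason your code does not work is that you decrement endDate and startDate inside the for loop. As a result, you check only a single dictionary item against each date range; if that item happens to land in that month, it gets printed. If not, it doesn't. To illustrate, here is what you get if you change your print to print(datetimeStart, datetimeEnd, k, v):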

2017-09-01 00:00:00 2017-10-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text'] 
['this is some tweet text'] 
2017-08-01 00:00:00 2017-09-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet'] 
2017-07-01 00:00:00 2017-08-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text'] 
2017-06-01 00:00:00 2017-07-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet'] 
2017-05-01 00:00:00 2017-06-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text'] 
2017-04-01 00:00:00 2017-05-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet'] 
2017-03-01 00:00:00 2017-04-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text'] 
2017-02-01 00:00:00 2017-03-01 00:00:00 2017-09-30 19:55:21 ['this is another tweet'] 
2017-01-01 00:00:00 2017-02-01 00:00:00 2017-09-30 19:55:20 ['this is some tweet text'] 

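The fix that changes your existing code the least is to move the decrements to just before the for loop and dedent the if endDate... block to the level of the while loop: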

while start: 
    endDate -= 1 
    startDate -= 1 
    for k, v in tweetDict.items(): 
        datetimeStart = datetime(2017, startDate, 1) 
        datetimeEnd = datetime(2017, endDate, 1) 
        print(datetimeStart, datetimeEnd, k, v) 
        if datetimeStart < k < datetimeEnd: 
            print(v) 
    if endDate == 2: 
        start = False 
        break 

Of course, at this point you might as well get rid of the if endDate... block and use while endDate > 2:
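
Put together, that simplification might look like the sketch below. It assumes `from datetime import datetime`, as the code above does, and the sample data with the second timestamp changed to seconds=21. The `tweets_in_range` dictionary is an addition for illustration, and the debugging print of every item is dropped.

```python
from datetime import datetime

tweetDict = {
    datetime(2017, 9, 30, 19, 55, 20): ['this is some tweet text'],
    datetime(2017, 9, 30, 19, 55, 21): ['this is another tweet'],
}

startDate = 10
endDate = 11
tweets_in_range = {}  # hypothetical helper: (start, end) -> list of tweet lists

while endDate > 2:
    # Step back one month per pass of the while loop, not per dictionary item.
    endDate -= 1
    startDate -= 1
    datetimeStart = datetime(2017, startDate, 1)
    datetimeEnd = datetime(2017, endDate, 1)
    print(datetimeStart, datetimeEnd)
    # Check every tweet against the current month range.
    for k, v in tweetDict.items():
        if datetimeStart < k < datetimeEnd:
            print(v)
            tweets_in_range.setdefault((datetimeStart, datetimeEnd), []).append(v)
```

Note that the strict `<` on the left excludes a tweet stamped exactly at midnight on the first of the month; `datetimeStart <= k < datetimeEnd` would include it.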
