2015-08-15 21 views
1

Python: Group single column by distinct value

I have a SQLite database with column headers and data like this:

expiration, title 
2015-08-15, example title 
2015-08-15, another sample title 
2015-08-15, another one 
2015-08-16, lorem ipsum 
2015-08-16, example 

Is there a way to group by expiration date, so that the result looks something like this:

Expiring 2015-08-15 
    example title 
    another sample title 
    another one 
Expiring 2015-08-16 
    lorem ipsum 
    example 

Here is what I have so far:

cur.execute("SELECT DISTINCT * FROM expiration WHERE exp BETWEEN date('now','-1 days') AND date('now','+6 days') ORDER BY exp")
sql.commit()
row = cur.fetchall()
msg = ""
for res in row:
    msg += res[1] + "\n"
print(msg)

But it does not group by date; it just lists all the titles.

Answers

1

You can definitely do this grouping in SQL, but without digging into the specifics of SQLite, it is also very easy to do the grouping in Python, like so:

import itertools as it

cur.execute("SELECT * FROM expiration WHERE exp BETWEEN date('now','-1 days') AND date('now','+6 days') ORDER BY exp")
sql.commit()
row = cur.fetchall()
msg = ''
# groupby only merges adjacent rows, so the ORDER BY exp in the query matters
for exp, group in it.groupby(row, key=lambda x: x[0]):
    msg += 'Expiring %s\n' % exp
    msg += ''.join('    %s\n' % x[1] for x in group)
print(msg)
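For anyone who wants to try the grouping without the original database, here is a minimal, self-contained sketch. It assumes a two-column table named expiration with columns exp and title (the layout implied by the query), loads the sample rows from the question into an in-memory database, and uses a hypothetical helper format_by_expiration. The date filter is left out so the result does not depend on today's date.

```python
import itertools
import sqlite3


def format_by_expiration(rows):
    """Build the indented report from (exp, title) rows sorted by exp."""
    lines = []
    # groupby only merges adjacent rows, so the rows must already be sorted by date
    for exp, group in itertools.groupby(rows, key=lambda row: row[0]):
        lines.append('Expiring %s' % exp)
        lines.extend('    %s' % title for _, title in group)
    return '\n'.join(lines)


# In-memory database holding the sample data from the question (schema is an assumption)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE expiration (exp TEXT, title TEXT)')
conn.executemany('INSERT INTO expiration VALUES (?, ?)', [
    ('2015-08-15', 'example title'),
    ('2015-08-15', 'another sample title'),
    ('2015-08-15', 'another one'),
    ('2015-08-16', 'lorem ipsum'),
    ('2015-08-16', 'example'),
])

# rowid as a secondary sort key keeps the titles in insertion order within each date
rows = conn.execute('SELECT exp, title FROM expiration ORDER BY exp, rowid').fetchall()
report = format_by_expiration(rows)
print(report)
```

Building a list of lines and joining once at the end avoids repeated string concatenation, and keeping the formatting in a function makes it easy to reuse with the date-filtered query.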
+0

I get the error: `TypeError("'itertools._grouper' object has no attribute '__getitem__'",)` – Bijan

+0

Sorry, I didn't test it. I think my edit should fix it now. – fivetentaylor

+0

Same thing. I commented out everything in the loop, but it still gives the same error. – Bijan

1

SQL does not return results in an indented, grouped structure; it returns them in a tabular format of rows and columns.
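To illustrate the point, here is a rough sketch of grouping on the SQL side with SQLite's group_concat aggregate. The result is still a flat table, one row per date, so the indentation has to be added in Python afterwards. The schema, the in-memory sample data, and the render helper are assumptions made for the example. Splitting on the newline separator also assumes that no title contains a newline, and SQLite does not guarantee the order in which group_concat joins the titles.

```python
import sqlite3

# In-memory database holding the sample data from the question (schema is an assumption)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE expiration (exp TEXT, title TEXT)')
conn.executemany('INSERT INTO expiration VALUES (?, ?)', [
    ('2015-08-15', 'example title'),
    ('2015-08-15', 'another sample title'),
    ('2015-08-15', 'another one'),
    ('2015-08-16', 'lorem ipsum'),
    ('2015-08-16', 'example'),
])

# group_concat collapses each date's titles into one string; the separator is bound as a parameter
query = """
    SELECT exp, group_concat(title, ?) AS titles
    FROM expiration
    GROUP BY exp
    ORDER BY exp
"""
grouped = conn.execute(query, ('\n',)).fetchall()


def render(grouped_rows):
    """Turn (exp, newline-joined titles) rows into the indented report."""
    lines = []
    for exp, titles in grouped_rows:
        lines.append('Expiring %s' % exp)
        lines.extend('    %s' % title for title in titles.split('\n'))
    return '\n'.join(lines)


print(render(grouped))
```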

Consider using groupby() from pandas, Python's data analysis package, which works seamlessly with SQLite:

import pandas as pd
import sqlite3

conn = sqlite3.connect('example.db')
dataframe = pd.read_sql("""SELECT DISTINCT * FROM expiration
                           WHERE exp BETWEEN date('now','-1 days')
                           AND date('now','+6 days')
                           ORDER BY exp""", conn)

# count the rows for each (exp, title) pair
expdategroup = dataframe.groupby(['exp', 'title'])
print(expdategroup['title'].count())

This gives the following output (here aggregating a count of titles for each exp date):

exp   title 
2015-08-15 example title   5 
      another sample title 3 
      another one   6 
      lorem ipsum   4 
      example    2 
2015-08-16 example title   2 
      another sample title 2 
      another one   1 
      lorem ipsum   4 
      example    7 
      ... 

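Alternatively, you can use len as the aggregation function, here still with pandas' pivot_table. This solution needs another variable (a good opportunity to add numeric data and use sum(), mean(), etc.):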

# 'othervar' stands for an additional column; it is not part of the question's data
table = pd.pivot_table(dataframe, values='othervar', index=['exp', 'title'], aggfunc=len)
print(table)

Roughly the same output:

exp   title 
2015-08-15 example title   5 
      another sample title 3 
      another one   6 
      lorem ipsum   4 
      example    2