Python slow on fetchone, hangs on fetchall

I'm writing a script that runs a SELECT query against a database and parses through ~33,000 records. Unfortunately, I'm running into problems at the cursor.fetchone()/cursor.fetchall() stage.

I first tried iterating over the cursor one record at a time, like this:
# Run through every record, extract the kanji, then query for FK and weight
printStatus("Starting weight calculations")
while True:
    # Get the next row in the cursor
    row = cursor.fetchone()
    if row == None:
        break
    # TODO: Determine if there's any kanji in row[2]
    weight = float((row[3] + row[4]))/2
    printStatus("Weight: " + str(weight))
Based on the output of printStatus (it prints a timestamp plus whatever string is passed to it), the script took roughly one second to process each row. This led me to believe the query was being re-run on each iteration of the loop (with a LIMIT 1 or something), since the same query runs quickly in SQLiteStudio *and* returns all 33,000 rows. I calculated that, at that rate, it would take around 7 hours to work through all 33,000 records.
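printStatus itself isn't shown in the question; for reference, a minimal version consistent with the timestamped console output below might look like this (the exact name and format are assumptions):

```python
from datetime import datetime

def printStatus(message):
    # Print (and return) a line like "12:56:26.019: Querying database":
    # an HH:MM:SS.mmm timestamp, a colon, then the message.
    now = datetime.now()
    line = "%02d:%02d:%02d.%03d: %s" % (
        now.hour, now.minute, now.second, now.microsecond // 1000, message)
    print(line)
    return line
```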
Rather than sit through that, I tried using cursor.fetchall() instead:
results = cursor.fetchall()

# Run through every record, extract the kanji, then query for FK and weight
printStatus("Starting weight calculations")
for row in results:
    # TODO: Determine if there's any kanji in row[2]
    weight = float((row[3] + row[4]))/2
    printStatus("Weight: " + str(weight))
Unfortunately, when it reached the cursor.fetchall() line, the Python executable locked up at 25% CPU and ~6 MB of memory. I left the script running for about 10 minutes, but nothing happened.

Are ~33,000 returned rows (about 5 MB of data) too much for Python to grab at once? Am I stuck iterating one row at a time? Or is there something I can do to speed things up?
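For what it's worth, there is a middle ground between row-at-a-time fetchone() and all-at-once fetchall(): fetchmany(), which pulls rows in fixed-size batches. A sketch, using an in-memory database and a made-up r_ele schema as stand-ins for the real ones:

```python
import sqlite3

# Demo setup: an in-memory database and a hypothetical r_ele schema,
# standing in for the real dictionary database and query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r_ele (id INTEGER, nokanji INTEGER, "
             "value TEXT, news REAL, ichi REAL)")
conn.executemany("INSERT INTO r_ele VALUES (?, ?, ?, ?, ?)",
                 [(i, 0, "kana", 1.0, 0.5) for i in range(5000)])

cursor = conn.cursor()
cursor.execute("SELECT * FROM r_ele")

# fetchmany() returns up to N rows per call: fewer Python-level calls
# than fetchone(), without materializing every row like fetchall().
processed = 0
while True:
    batch = cursor.fetchmany(1000)
    if not batch:
        break
    for row in batch:
        weight = (row[3] + row[4]) / 2
        processed += 1

print(processed)  # 5000
```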
EDIT: Here's some console output:
12:56:26.019: Adding new column 'weight' and related index to r_ele
12:56:26.019: Querying database
12:56:28.079: Starting weight calculations
12:56:28.079: Weight: 1.0
12:56:28.079: Weight: 0.5
12:56:28.080: Weight: 0.5
12:56:28.338: Weight: 1.0
12:56:28.339: Weight: 3.0
12:56:28.843: Weight: 1.5
12:56:28.844: Weight: 1.0
12:56:28.844: Weight: 0.5
12:56:28.844: Weight: 0.5
12:56:28.845: Weight: 0.5
12:56:29.351: Weight: 0.5
12:56:29.855: Weight: 0.5
12:56:29.856: Weight: 1.0
12:56:30.371: Weight: 0.5
12:56:30.885: Weight: 0.5
12:56:31.146: Weight: 0.5
12:56:31.650: Weight: 1.0
12:56:32.432: Weight: 0.5
12:56:32.951: Weight: 0.5
12:56:32.951: Weight: 0.5
12:56:32.952: Weight: 1.0
12:56:33.454: Weight: 0.5
12:56:33.455: Weight: 0.5
12:56:33.455: Weight: 1.0
12:56:33.716: Weight: 0.5
12:56:33.716: Weight: 1.0
And here's the query plan for the SQL query:
0 0 0 SCAN TABLE r_ele AS re USING COVERING INDEX r_ele_fk (~500000 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0 0 SEARCH TABLE re_pri USING INDEX re_pri_fk (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 2
2 0 0 SEARCH TABLE ke_pri USING INDEX ke_pri_fk (fk=?) (~10 rows)
2 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 3
3 0 0 SEARCH TABLE k_ele USING AUTOMATIC COVERING INDEX (value=?) (~7 rows)
3 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 4
4 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 5
5 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 6
6 0 0 SEARCH TABLE re_pri USING INDEX re_pri_fk (fk=?) (~10 rows)
0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 7
7 0 0 SEARCH TABLE ke_pri USING INDEX ke_pri_fk (fk=?) (~10 rows)
7 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 8
8 0 0 SEARCH TABLE k_ele USING AUTOMATIC COVERING INDEX (value=?) (~7 rows)
8 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 9
9 0 0 SEARCH TABLE k_ele USING COVERING INDEX idx_k_ele (fk=?) (~10 rows)
Have you tried iterating over the cursor directly: 'for row in cursor: ...'? –
'fetchone' (or iterating) doesn't cause it to re-run the query each time. A 'cursor' object usually doesn't even know what query it ran. So whatever your problem is, that isn't it. – abarnert
Also, as a side note: use 'if row is None:' rather than 'if row == None:'. In most cases it doesn't really make a difference, but it's more idiomatic (it's also a bit faster, and in the rare cases where it does make a difference, it's what you want). – abarnert
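To illustrate the side note above: '==' invokes '__eq__', which a class can override, while 'is' checks object identity and can't be fooled:

```python
class Weird:
    # A class whose == compares equal to everything, including None.
    def __eq__(self, other):
        return True

w = Weird()
print(w == None)   # True  -- __eq__ says so
print(w is None)   # False -- w is not the None singleton
```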