Faster way to run MySQL queries in Python

I have list1 and list2, each containing 1,104,824 values.

table1 has 350 million rows and 3 columns: ID, name1, name2.

This is what I am trying to do:

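import MySQLdb as mdb  # assumed import; the original snippet only shows the alias mdb

con = mdb.connect('localhost', 'user', 'password', 'db')
cur = con.cursor()
# One query per (list1[i], list2[i]) pair: about 1.1 million round trips
for i in range(1104824):
    sql = "select count(distinct(a.ID)) from (select name1 ,ID from table1 where name2 <> '"+str(list1[i])+"') as a where a.name1 = '"+str(list2[i])+"'"
    cur.execute(sql)
    data = cur.fetchone()[0]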

But it is very, very slow. Is there a faster way to run this query?

Source

2016-05-10 dPdms

Post the table structure and what you are trying to do. Surely there is a way that doesn't involve 1.1 million queries? – e4c5

If 'ID' is the 'PRIMARY KEY', then you can change 'COUNT(DISTINCT ID)' to 'COUNT(*)'. If '(name1, name2)' is unique, you might be able to get rid of 'ID'.
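One way to read the comment about avoiding 1.1 million separate queries is to load the pairs into a temporary table and let the database answer all of them in a single joined query. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server, and the table name pairs, its column names, and the sample rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch is runnable anywhere.
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Toy version of table1 (sample rows are made up for illustration).
cur.execute("CREATE TABLE table1 (ID INTEGER, name1 TEXT, name2 TEXT)")
cur.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "y"), (3, "a", "z"), (4, "b", "x")],
)

list1 = ["x", "x", "q"]  # values to exclude in name2
list2 = ["a", "b", "c"]  # values to match in name1

# Hypothetical helper table holding one row per (list1[i], list2[i]) pair.
cur.execute(
    "CREATE TEMP TABLE pairs (pos INTEGER, excluded_name2 TEXT, wanted_name1 TEXT)"
)
cur.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [(i, a, b) for i, (a, b) in enumerate(zip(list1, list2))],
)

# One query for all pairs; LEFT JOIN keeps pairs that match nothing (count 0).
cur.execute(
    """
    SELECT p.pos, COUNT(DISTINCT t.ID)
    FROM pairs p
    LEFT JOIN table1 t
      ON t.name1 = p.wanted_name1 AND t.name2 <> p.excluded_name2
    GROUP BY p.pos
    ORDER BY p.pos
    """
)
counts = [row[1] for row in cur.fetchall()]
print(counts)
```

In MySQL the same shape should carry over with CREATE TEMPORARY TABLE and the driver's own placeholder style. Whether it is actually faster on 350 million rows depends on the indexes and was not measured here.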

This is your query:

select count(distinct a.ID) 
from (select name1, ID 
     from table1 
     where name2 <> '"+str(list1[i])+"' 
    ) a 
where a.name1 = '"+str(list2[i])+"'";

I would suggest writing it as:

select count(distinct ID) 
from table1 
where name2 <> '"+str(list1[i])+"' and 
     name1 = '"+str(list2[i])+"'";

You can then speed up the query with an index on table1(name1, name2, id), with all three columns in that order.
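As a rough illustration of that index, the sketch below creates it and runs the simplified query. It is not the answerer's code: it assumes Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and the index name idx_name1_name2_id and the sample rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here; the CREATE INDEX statement has the
# same form in MySQL.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE table1 (ID INTEGER, name1 TEXT, name2 TEXT)")
cur.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "y"), (3, "a", "z"), (4, "b", "x")],
)

# Composite index: the equality column (name1) first, then name2, then ID,
# so the query can be answered from the index alone.
cur.execute("CREATE INDEX idx_name1_name2_id ON table1 (name1, name2, ID)")

sql = "SELECT COUNT(DISTINCT ID) FROM table1 WHERE name2 <> ? AND name1 = ?"
params = ("x", "a")
cur.execute(sql, params)
count = cur.fetchone()[0]

# Ask the engine how it plans to run the query (MySQL's counterpart is EXPLAIN).
plan = cur.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
for row in plan:
    print(row)
```

Checking the plan before and after adding the index is a cheap way to confirm that the index is actually being used.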

Note: I would write the SQL as:

sql = """ 
select count(distinct ID) 
from table1 
where name2 <> '{0}' and name1 = '{1}' 
""".format(str(list1[i]), str(list2[i]))

Source

2016-05-10 02:02:09

It looks like this would work just as well, given the appropriate indexes:

select count(distinct id) 
from table1 
where name2 <> 'Name1' 
    and name1 = 'Name2'

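Consider using parameterized queries, though. Your query is vulnerable to SQL injection and will break on names that contain an apostrophe, for example. There are plenty of examples; here are a couple: Python MySQL Parameterized Queries and https://stackoverflow.com/a/1633589/1073631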
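A minimal sketch of what a parameterized version could look like follows. The helper count_ids and its placeholder argument are hypothetical, introduced only for illustration. MySQLdb uses %s placeholders, while the sqlite3 module used here (purely so the sketch runs without a MySQL server) uses ?. The sample rows are made up.

```python
import sqlite3


def count_ids(cur, excluded_name2, wanted_name1, placeholder="%s"):
    """Run the count query with bound parameters instead of string pasting.

    Only the placeholder token is formatted into the SQL text; the values
    themselves are passed separately, so the driver handles the quoting.
    """
    sql = (
        "SELECT COUNT(DISTINCT ID) FROM table1 "
        "WHERE name2 <> {p} AND name1 = {p}"
    ).format(p=placeholder)
    cur.execute(sql, (excluded_name2, wanted_name1))
    return cur.fetchone()[0]


# Stand-in database so the sketch is runnable; with MySQLdb the cursor
# would come from mdb.connect(...).cursor() and placeholder stays "%s".
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE table1 (ID INTEGER, name1 TEXT, name2 TEXT)")
cur.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "O'Brien", "x"), (2, "O'Brien", "y"), (3, "Smith", "x")],
)

# A name containing an apostrophe no longer breaks the query.
result = count_ids(cur, "x", "O'Brien", placeholder="?")
print(result)
```

With string concatenation, the name O'Brien would end the SQL string literal early; with bound parameters it is passed through as plain data.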

Source

2016-05-10 02:02:35 sgeddes
