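Learning MySQL, Python - skipping duplicates
I've been trying to learn how to update a SQL database from Python and am attempting something simple: loop over a CSV file of Fortune 500 companies with revenue information and push the rows into a SQL database. I've run it a few times and it works well; the only problem is that I get duplicates, because I ran the same file several times.
Going forward, I figure it's good to learn how to avoid duplicates. After looking around, I found a solution that uses WHERE NOT EXISTS, but I'm getting an error. Any advice is welcome, as I'm brand new to this.
Note - I know I should be updating multiple rows at a time; that's my next lesson.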
import pymysql
import csv
with open('companies.csv','rU') as f:
    reader = csv.DictReader(f)
    for i in reader:
        conn = pymysql.connect(host='host', user='user', passwd='pw', db='db_test')
        cur = conn.cursor()
        query1 = "INSERT INTO companies (Name, Revenue, Profit, Stock_Price) VALUES (\'{}\',{},{},{})".format(str(i['Standard']),float(i['Revenues']),float(i['Profits']),float(i['Rank']))
        query2 = 'WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = \'{}\')'.format(str(i['Standard']))
        query = query1+' '+query2
        cur.execute(query)
        conn.commit()
        cur.close()
OUTPUT:
INSERT INTO companies (Name, Revenue, Profit, Stock_Price) VALUES ('WalMart Stores',469.2,16999.0,1.0) WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = 'WalMart Stores')
Error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE NOT EXISTS (SELECT Name FROM companies WHERE Name = 'WalMart Stores')' at line 1")
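For context on the error above: INSERT ... VALUES does not accept a WHERE clause, which is why the parser stops at WHERE NOT EXISTS. One form that does accept it is INSERT ... SELECT. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 as a stand-in so it runs without a server, the helper name insert_if_missing is hypothetical, and the table layout is guessed from the column names in the query. With pymysql the placeholders would be %s rather than ?, and older MySQL versions may need FROM DUAL before the WHERE.

```python
import sqlite3


def insert_if_missing(cur, name, revenue, profit, stock_price):
    """Insert one company unless a row with the same Name already exists.

    Hypothetical helper. Values are passed as parameters instead of being
    formatted into the SQL string, which also avoids quoting problems.
    """
    cur.execute(
        "INSERT INTO companies (Name, Revenue, Profit, Stock_Price) "
        "SELECT ?, ?, ?, ? "
        "WHERE NOT EXISTS (SELECT 1 FROM companies WHERE Name = ?)",
        (name, revenue, profit, stock_price, name),
    )


# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE companies "
    "(Name TEXT, Revenue REAL, Profit REAL, Stock_Price REAL)"
)

# Run the same insert twice, as happens when the CSV is loaded twice.
insert_if_missing(cur, "WalMart Stores", 469.2, 16999.0, 1.0)
insert_if_missing(cur, "WalMart Stores", 469.2, 16999.0, 1.0)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM companies").fetchone()[0]
print(row_count)
```

The second call inserts nothing because the subquery already finds a matching Name.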
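Thanks so much. That makes sense: I need to create a unique index, and I should have thought of that. From a performance standpoint, which suggestion is preferred? It looks like option 2 requires more calls to the database, but I may well be wrong. Thanks again! – kmomo 2014-10-11 16:51:23
From a pure performance standpoint, the unique index seems more efficient, because it delegates the responsibility to the database server. For training purposes, though, I'd suggest you try option 2 first and then option 1. – Barranka 2014-10-11 16:52:56
By the way, if you found this answer useful, please upvote and/or accept it ;) (oh, you've already upvoted) – Barranka 2014-10-11 16:53:17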
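To illustrate the unique-index idea mentioned in these comments, here is a rough sketch, not necessarily what the answer itself showed. It again uses the built-in sqlite3 module as a stand-in so it can run anywhere; the index name idx_companies_name and the "Example Corp" row are made up for illustration. In MySQL the statement would be INSERT IGNORE rather than SQLite's INSERT OR IGNORE, with %s placeholders under pymysql.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE companies "
    "(Name TEXT, Revenue REAL, Profit REAL, Stock_Price REAL)"
)
# The unique index makes the database itself reject a repeated Name.
cur.execute("CREATE UNIQUE INDEX idx_companies_name ON companies (Name)")

# Illustrative rows; "Example Corp" and its numbers are placeholders.
rows = [
    ("WalMart Stores", 469.2, 16999.0, 1.0),
    ("Example Corp", 1.0, 2.0, 2.0),
    ("WalMart Stores", 469.2, 16999.0, 1.0),  # duplicate, silently skipped
]

# One batched call; rows that violate the unique index are ignored.
cur.executemany(
    "INSERT OR IGNORE INTO companies (Name, Revenue, Profit, Stock_Price) "
    "VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

names = [r[0] for r in cur.execute("SELECT Name FROM companies ORDER BY Name")]
print(names)
```

With the index in place, re-running the whole load leaves the table unchanged, and the duplicate check costs no extra round trip per row.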