Optimizing bulk insert operations with JDBC and MySQL

I have to perform a large number of insert operations (27k in this case) and I want to find the optimal way to do it. This is the code I have right now. As you can see, I am using prepared statements and batches, and I am executing every 1000 rows (I have also tried smaller numbers, such as 10 and 100, but the run time was long again). One thing omitted from the query is that there is an auto-generated ID, in case it matters:
private void parseIndividualReads(String file, DBAccessor db) {
    BufferedReader reader;
    try {
        Connection con = db.getCon();
        PreparedStatement statement = con.prepareStatement(
                "INSERT INTO `vgsan01_process_log`.`contigs_and_large_singletons` "
                + "(`seq_id`, `length`, `ws_id`, `num_of_reads`) VALUES (?, ?, ?, ?)");
        long count = 0;
        reader = new BufferedReader(new FileReader(logDir + "/" + file));
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                count++;
                String res[] = parseHeader(line);
                statement.setString(1, res[0]);
                statement.setInt(2, Integer.parseInt(res[1]));
                statement.setInt(3, id);
                statement.setInt(4, -1);
                statement.addBatch();
                // Flush every 1000 queued rows. Checking here, right after
                // addBatch(), avoids calling executeBatch() over and over on an
                // already-empty batch while non-header lines are being skipped.
                if (count % 1000 == 0)
                    statement.executeBatch();
            }
        }
        // Insert whatever is left in the final, partial batch.
        statement.executeBatch();
    } catch (FileNotFoundException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error opening file: " + file, ex);
    } catch (IOException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error reading from file: " + file, ex);
    } catch (SQLException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error inserting individual statistics " + file, ex);
    }
}
Any other tips on what could be changed to speed this up? I mean, a single insert statement does not carry much information - I would say no more than 50 characters across all 4 columns.
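One driver-level setting worth checking here (an assumption about the setup, not mentioned in the question): MySQL Connector/J only rewrites a JDBC batch into a single multi-row INSERT when `rewriteBatchedStatements=true` is set on the connection URL; without it, each `addBatch()` row is still sent to the server as a separate statement. A minimal sketch, where the host and database name are placeholders:

```java
// Sketch: building a batch-friendly MySQL Connector/J URL.
// The host and database below are placeholders for this example.
public class BatchUrlExample {
    static String batchFriendlyUrl(String host, String db) {
        // rewriteBatchedStatements=true lets the driver collapse a batch of
        // identical INSERTs into one multi-row INSERT on the wire.
        return "jdbc:mysql://" + host + "/" + db + "?rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        System.out.println(batchFriendlyUrl("localhost", "vgsan01_process_log"));
    }
}
```

The rest of the code stays unchanged; only the URL passed to `DriverManager.getConnection` (or the pool configuration) needs the extra property.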
EDIT:

OK, following the advice given, I have refactored the method as below. The speed-up is huge. You can even try playing with the value 1000, which might yield better results:
private void parseIndividualReads(String file, DBAccessor db) {
    BufferedReader reader;
    PrintWriter writer;
    try {
        Connection con = db.getCon();
        con.setAutoCommit(false); // commit once at the end instead of per statement
        Statement st = con.createStatement();
        StringBuilder sb = new StringBuilder(10000);
        reader = new BufferedReader(new FileReader(logDir + "/" + file));
        writer = new PrintWriter(new BufferedWriter(new FileWriter(logDir + "/velvet-temp-contigs", true)), true);
        String line;
        long count = 0;
        while ((line = reader.readLine()) != null) {
            if (count != 0 && count % 1000 == 0) {
                sb.deleteCharAt(sb.length() - 1); // drop the trailing comma
                st.executeUpdate("INSERT INTO `vgsan01_process_log`.`contigs_and_large_singletons` (`seq_id`, `length`, `ws_id`, `num_of_reads`) VALUES " + sb);
                sb.setLength(0); // reset the buffer for the next batch
                count = 0;
            }
            // We basically build a giant VALUES (),(),()... string that we use
            // for the insert. Note that the values are concatenated, not bound,
            // so res[] must come from a trusted source (or be escaped) -
            // otherwise this is open to SQL injection.
            if (line.startsWith(">")) {
                count++;
                String res[] = parseHeader(line);
                sb.append("('" + res[0] + "','" + res[1] + "','" + id + "','-1'),");
            }
        }
        // Insert all the remaining rows; the guard avoids an exception (and an
        // empty VALUES clause) when the last batch happens to be empty.
        if (sb.length() > 0) {
            sb.deleteCharAt(sb.length() - 1);
            st.executeUpdate("INSERT INTO `vgsan01_process_log`.`contigs_and_large_singletons` (`seq_id`, `length`, `ws_id`, `num_of_reads`) VALUES " + sb);
        }
        con.commit();
    } catch (FileNotFoundException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error opening file: " + file, ex);
    } catch (IOException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error reading from file: " + file, ex);
    } catch (SQLException ex) {
        Logger.getLogger(VelvetStats.class.getName()).log(Level.SEVERE, "Error working with mysql", ex);
    }
}
Solution 1 is not possible unless the database and the web server are on the same physical machine. As for solution 2, the speed is very good, but you also need to take care of SQL injection. – Dapeng
1 is not feasible because the application performing the inserts runs on a separate machine, and modifying the upload file is not feasible either. About your second option - isn't that exactly what the prepared statement + batch approach is supposed to do? – LordDoskias
I think that is where the answers differ - the speed is much better with solution 2. I am going to check it. As for the rest, you can set connection.setAutoCommit(false); and then connection.commit(); to make it run faster. – LaGrandMere
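The injection concern raised in the comments can be addressed while keeping the multi-row VALUES shape: generate the statement with `?` placeholders and bind the parsed values through `setString`/`setInt`, instead of concatenating them into the SQL text. A sketch of the statement builder, assuming the same four columns (the table and column names are taken from the question):

```java
// Sketch: a parameterized multi-row INSERT. The values are bound later via
// PreparedStatement setters, so no escaping of user data is needed.
public class MultiRowInsert {
    static String buildSql(int rows) {
        StringBuilder sb = new StringBuilder(
                "INSERT INTO `vgsan01_process_log`.`contigs_and_large_singletons` "
                + "(`seq_id`, `length`, `ws_id`, `num_of_reads`) VALUES ");
        for (int i = 0; i < rows; i++) {
            // One (?, ?, ?, ?) group per row, comma-separated.
            sb.append(i == 0 ? "(?, ?, ?, ?)" : ", (?, ?, ?, ?)");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildSql(2));
    }
}
```

One PreparedStatement would be prepared per batch size (e.g. one for 1000 rows and one for the final partial batch), with parameter indices 4*row+1 through 4*row+4 bound for each row before executeUpdate.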