Perl with MySQL is terribly slow, how to speed it up

unit (id, fir_name, sec_name)
author (id, name, unit_id)
author_paper (id, author_id, paper_id)

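I want to unify the authors ("the same author" means the names are identical and the fir_name values of their units are identical), and I have to update the author_paper table at the same time. With Perl and MySQL this is terribly slow. How can I speed it up?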

Here is what I do:

$conn->do('create index author_name on author (name)'); 
my $sqr = $conn->prepare("select name from author group by name having count(*) > 1"); 
$sqr->execute(); 
while(my @row = $sqr->fetchrow_array()) { 
    my $dup_name = $row[0]; 
    $dup_name = formatHtml($dup_name); 
    my $sqr2 = $conn->prepare("select id, unit_id from author where name = '$dup_name'"); 
    $sqr2->execute(); 

    my %fir_name_hash =(); 
    while(my @row2 = $sqr2->fetchrow_array()) { 
     my $author_id = $row2[0]; 
     my $unit_id = $row2[1]; 
     my $fir_name = getFirNameInUnit($conn, $unit_id); 
     if (not exists $fir_name_hash{$fir_name}) { 
      $fir_name_hash{$fir_name} = []; #anonymous arr reference 
     } 
     $x = $fir_name_hash{$fir_name}; 
     push @$x, $author_id; 
    } 

    while(my ($fir_name, $author_id_arr) = each(%fir_name_hash)) { 
     my $count = scalar @$author_id_arr; 
     if ($count == 1) {next;} 
     my $author_id = $author_id_arr->[0]; 
     for ($i = 1; $i < $count; $i++) { 
      #print "$author_id_arr->[$i] => $author_id\n"; 
      unifyAuthorAndAuthorPaperTable($conn, $author_id, $author_id_arr->[$i]); #just delete in author table, and update in author_paper table 
     } 
    } 
}

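select count(*) from author; # 240,000
select count(distinct(name)) from author; # 77,000

It is terribly slow! I ran it for 5 hours and it had only removed about 40,000 duplicate names. How can I make it run faster? I am eager for your suggestions.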

Source

2012-03-03 lhdgriver

Creating an index on a non-empty table can take some time. 240k rows is not a big table. – Kamil 2012-03-03 16:17:00

Possible duplicate of [perl with mysql terribly slow, how to fix it](http://stackoverflow.com/questions/9533333/perl-with-mysql-terribly-slow-how-to-fix-it) – Toto 2012-03-03 16:18:38

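You should not prepare the second SQL statement inside the loop. When you use ? placeholders, you can prepare the statement once and actually benefit from the prepare: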

$conn->do('create index author_name on author (name)'); 

my $sqr = $conn->prepare('select name from author group by name having count(*) > 1'); 

# ? is the placeholder. The database driver knows whether the value is an
# integer or a string and quotes the input if needed.
my $sqr2 = $conn->prepare('select id, unit_id from author where name = ?'); 

$sqr->execute(); 
while(my @row = $sqr->fetchrow_array()) { 
    my $dup_name = $row[0]; 
    $dup_name = formatHtml($dup_name); 

    # Now you can reuse the prepared handle with different input 
    $sqr2->execute($dup_name); 

    my %fir_name_hash =(); 
    while(my @row2 = $sqr2->fetchrow_array()) { 
     my $author_id = $row2[0]; 
     my $unit_id = $row2[1]; 
     my $fir_name = getFirNameInUnit($conn, $unit_id); 
     if (not exists $fir_name_hash{$fir_name}) { 
      $fir_name_hash{$fir_name} = []; #anonymous arr reference 
     } 
     $x = $fir_name_hash{$fir_name}; 
     push @$x, $author_id; 
    } 

    while(my ($fir_name, $author_id_arr) = each(%fir_name_hash)) { 
     my $count = scalar @$author_id_arr; 
     if ($count == 1) {next;} 
     my $author_id = $author_id_arr->[0]; 
     for ($i = 1; $i < $count; $i++) { 
      #print "$author_id_arr->[$i] => $author_id\n"; 
      unifyAuthorAndAuthorPaperTable($conn, $author_id, $author_id_arr->[$i]); #just delete in author table, and update in author_paper table 
     } 
    } 
}

This should speed things up as well.

Source

2012-03-03 16:12:22 dgw

Doing it this way is much faster. See also [link](http://www.mysqlfaqs.net/mysql-faqs/General-Questions/What-is-prepared-statement-in-MySQL) – lhdgriver 2012-03-04 17:19:29

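When I see a query and a loop, I think you have a latency problem: you query to get a set of values, then iterate over the set to do something else. If that means a network round trip to the database for every row in the set, that is a lot of latency.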

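It would be better if you could do the work in a single query using an UPDATE and a sub-select, or if you could batch those requests and perform them all in one round trip.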
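The single-query idea could look roughly like the sketch below. It is only an illustration under assumptions, not the asker's actual unifyAuthorAndAuthorPaperTable: the sub name merge_duplicate_authors and the temporary table author_map are made up, the lowest id in each (name, fir_name) group is arbitrarily the one kept, authors without a matching unit row are left alone, and the final commit assumes a transactional engine such as InnoDB.

```perl
use strict;
use warnings;

# Sketch: merge duplicate authors with a few set-based statements instead of
# one round trip per author. $dbh is assumed to be a connected DBI handle,
# for example (hypothetical connection details):
#   use DBI;
#   my $dbh = DBI->connect('DBI:mysql:database=papers', 'user', 'password',
#                          { RaiseError => 1, AutoCommit => 0 });
sub merge_duplicate_authors {
    my ($dbh) = @_;

    # 1. Map every duplicate author id to the id that will be kept.
    $dbh->do(q{
        CREATE TEMPORARY TABLE author_map AS
        SELECT a.id AS old_id, k.keep_id
        FROM author a
        JOIN unit u ON u.id = a.unit_id
        JOIN (
            SELECT a2.name, u2.fir_name, MIN(a2.id) AS keep_id
            FROM author a2
            JOIN unit u2 ON u2.id = a2.unit_id
            GROUP BY a2.name, u2.fir_name
            HAVING COUNT(*) > 1
        ) k ON k.name = a.name AND k.fir_name = u.fir_name
        WHERE a.id <> k.keep_id
    });

    # 2. Point the papers of the duplicates at the kept author.
    my $moved = $dbh->do(q{
        UPDATE author_paper ap
        JOIN author_map m ON m.old_id = ap.author_id
        SET ap.author_id = m.keep_id
    });

    # 3. Delete the duplicate author rows.
    my $deleted = $dbh->do(q{
        DELETE a FROM author a
        JOIN author_map m ON m.old_id = a.id
    });

    $dbh->do('DROP TEMPORARY TABLE author_map');
    $dbh->commit;

    # DBI's do() returns the affected row count ("0E0" when it is zero).
    return ($moved, $deleted);
}
```

Whether this matches the intended merge rule (for instance how rows with a NULL fir_name, or papers already linked to both authors, should be handled) would need checking against the real data before running it.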

You will get an additional speedup if you use indexes wisely. Every column in a WHERE clause should have an index. Every foreign key should have an index.

I would run EXPLAIN PLAN on your queries and see whether any TABLE SCAN is going on. If there is, you have to index properly.
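As a rough illustration of the indexing and plan-checking advice, the sketch below adds indexes on the columns the lookups are likely to filter on and reports full table scans. In MySQL the statement is plain EXPLAIN, and a row whose type column is ALL indicates a full table scan. The sub names and index names are made up for the example, and which columns the asker's helper functions actually filter on is an assumption.

```perl
use strict;
use warnings;

# Sketch: $dbh is assumed to be a connected DBI handle (DBD::mysql)
# created with RaiseError => 1.

# Index the foreign-key columns used for lookups. CREATE INDEX fails if an
# index with the same name already exists.
sub add_lookup_indexes {
    my ($dbh) = @_;
    $dbh->do('CREATE INDEX author_unit_id ON author (unit_id)');
    $dbh->do('CREATE INDEX author_paper_author_id ON author_paper (author_id)');
    return;
}

# Return the EXPLAIN rows (as hash references) that show a full table scan.
sub full_scans {
    my ($dbh, $sql) = @_;
    my $plan = $dbh->selectall_arrayref("EXPLAIN $sql", { Slice => {} });
    return grep { defined $_->{type} && $_->{type} eq 'ALL' } @$plan;
}

# Possible usage:
#   for my $row (full_scans($dbh, 'SELECT id FROM author_paper WHERE author_id = 42')) {
#       warn "full table scan on $row->{table}\n";
#   }
```

The SQL passed to full_scans is interpolated as-is, so it should only ever be a query written by the developer, never user input.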

I wonder whether a properly designed JOIN would help you solve the problem?
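One way a JOIN might help, sketched under assumptions: a single query can fetch every author with a duplicated name together with the fir_name of its unit, so neither the per-name SELECT nor a per-author unit lookup is needed. The sub name duplicate_author_groups is hypothetical, a NULL fir_name is treated as an empty string, and authors without a unit row are skipped by the inner join.

```perl
use strict;
use warnings;

# Sketch: $dbh is assumed to be a connected DBI handle (DBD::mysql).
# Returns an array reference of groups; each group is an array reference of
# author ids that share both name and unit fir_name (two or more ids).
sub duplicate_author_groups {
    my ($dbh) = @_;

    # One round trip: authors whose name occurs more than once, joined to
    # their unit so fir_name arrives in the same row.
    my $rows = $dbh->selectall_arrayref(q{
        SELECT a.id, a.name, u.fir_name
        FROM author a
        JOIN unit u ON u.id = a.unit_id
        JOIN (
            SELECT name FROM author GROUP BY name HAVING COUNT(*) > 1
        ) d ON d.name = a.name
        ORDER BY a.id
    });

    # Group the ids in memory by name, then by fir_name.
    my %ids_for;
    for my $row (@$rows) {
        my ($id, $name, $fir_name) = @$row;
        $fir_name = '' unless defined $fir_name;
        push @{ $ids_for{$name}{$fir_name} }, $id;
    }

    # Keep only the groups that really contain duplicates.
    my @groups;
    for my $name (sort keys %ids_for) {
        for my $fir_name (sort keys %{ $ids_for{$name} }) {
            my $ids = $ids_for{$name}{$fir_name};
            push @groups, $ids if @$ids > 1;
        }
    }
    return \@groups;
}
```

Each returned group could then be handed to the existing merge routine, with the first id in the group as the one to keep.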

240,000 rows in one table and 77,000 in another is not that big a database.

Source

2012-03-03 14:26:19 duffymo

You are right. I don't like that loop either. He is not using SQL, and perhaps doesn't know how to write complex queries. – Kamil 2012-03-03 16:22:38
