Optimizing code/database for a 50,000-row table

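I have a list of 300 RSS news feeds stored in a database, and every few minutes I fetch the contents of every feed. Each feed contains about 10 articles, and I want to store each article in the database.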

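The problem: my database table has over 50,000 rows and is growing quickly; every time I run my script to fetch new feeds, it adds at least 100 more rows. It has reached the point where my database is hitting 100% CPU utilization.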

The question: how do I optimize my code/database?

Note: I am not concerned about my server's CPU (it stays below 15% while this runs). I am very concerned about my database's CPU.

Possible solutions I see:

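Currently, every time the script runs, it calls $this->set_content_source_cache, which returns an array of array('link', 'link', 'link', etc.) built from all the rows in the table. This is used later as a cross-reference to make sure there are no duplicate links. Would it speed things up to skip this and simply change the database so the link column is unique? Or possibly put this array in memcached, so it only has to be built once an hour or once a day?
Add a break statement when a link is already set, so that the loop moves on to the next source?
Only check against links that are less than a week old?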
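To make the first idea concrete, here is a minimal sketch of letting the database reject duplicate links instead of loading every link into PHP. It is not my real code. It assumes PDO rather than my framework's db wrapper, and uses an in-memory SQLite table so it runs on its own. The helper name insert_if_new and the index name idx_content_link are made up for illustration; only the column names follow my content table.

```php
<?php
// Sketch only: PDO + in-memory SQLite stand in for the real MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE content (
    content_id INTEGER PRIMARY KEY,
    headline   TEXT,
    link       TEXT NOT NULL
)');
// The unique index makes the database itself refuse a repeated link.
$pdo->exec('CREATE UNIQUE INDEX idx_content_link ON content (link)');

/**
 * Hypothetical helper: insert an article and return its new id,
 * or null when the link is already stored.
 */
function insert_if_new(PDO $pdo, string $headline, string $link): ?int
{
    $stmt = $pdo->prepare('INSERT INTO content (headline, link) VALUES (?, ?)');
    try {
        $stmt->execute([$headline, $link]);
    } catch (PDOException $e) {
        // SQLSTATE class 23 = integrity constraint violation (here: duplicate link).
        if (strncmp((string) $e->getCode(), '23', 2) === 0) {
            return null;
        }
        throw $e;
    }
    return (int) $pdo->lastInsertId();
}

$first  = insert_if_new($pdo, 'Hello', 'http://example.com/a');       // new row
$second = insert_if_new($pdo, 'Hello again', 'http://example.com/a'); // duplicate link
```

With something like this the 50,000-element array would no longer be needed, at the cost of one rejected insert per link that is already stored. Depending on the column type, MySQL may not be able to index a very long link column directly; I have not checked that against my schema.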

Here is what I am doing:

//$this->set_content_source_cache goes through all 50,000 rows and adds each link to an array so that it's array('link', 'link', 'link', etc.) 
    $cache_source_array = $this->set_content_source_cache(); 

    $qry = "select source, source_id, source_name, geography_id, industry_id from content_source"; 
    foreach($this->sql->result($qry) as $row_source) { 

     $feed = simplexml_load_file($row_source['source']); 

     if(!empty($feed)) { 

      for ($i=0; $i < 10 ; $i++) { 
       // most often there are only 10 items per feed. Since we check every 2 minutes, if there are 
       // a few more, then meh, we probably got them last time around 
       if(!empty($feed->channel->item[$i])) { 
        // make sure that the item is not blank 
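        // cast to string: SimpleXMLElement objects cannot be used as array keys 
        $title = (string) $feed->channel->item[$i]->title; 
        $content = (string) $feed->channel->item[$i]->description; 
        $link = (string) $feed->channel->item[$i]->link; 
        // the RSS element is pubDate, and SimpleXML names are case-sensitive 
        $pubdate = (string) $feed->channel->item[$i]->pubDate; 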
        $source_id = $row_source['source_id']; 
        $source_name = $row_source['source_name']; 
        $geography_id = $row_source['geography_id']; 
        $industry_id = $row_source['industry_id']; 

        // random stuff in here to each link/article to make it data-worthy 
        if(!isset($cache_source_array[$link])) { 

         // start the transaction 
         $this->db->trans_start(); 

         $qry = "insert into content (headline, content, link, article_date, status, source_id, source_name, ". 
          "industry_id, geography_id) VALUES ". 
          "(?, ?, ?, ?, 2, ?, ?, ?, ?)"; 
         $this->db->query($qry, array($title, $content, $link, $pubdate, $source_id, $source_name, $industry_id, $geography_id)); 

         // this is my framework's version of mysqli_insert_id() 
         $content_id = $this->db->insert_id(); 

         $qry = "insert into content_ratings (content_id, comment_count, company_count, contact_count, report_count, read_count) VALUES ". 
          "($content_id, '0', '0', 0, '0', '0')"; 
         $result2 = $this->db->query($qry); 

         $this->db->trans_complete(); 

         if($this->db->trans_status() == TRUE) { 
          $cache_source_array[$link] = $content_id; 
          echo "Good!<br />"; 
         } else { 
          echo "Bad!<br />"; 
         } 
        } else { 
         // link already exists 
         echo "link exists!"; 
        } 
       } 
      } 
     } else { 
      // feed is empty 
     } 
    } 
}
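For the break idea, the inner loop above could stop at the first link it already knows. This is a rough sketch that assumes each feed lists its newest items first, which RSS does not guarantee. The helper name new_links_until_known is made up, and $known_links is assumed to be keyed by link, the same way $cache_source_array is checked with isset() above.

```php
<?php
/**
 * Hypothetical helper: return the links from one feed that are not yet known,
 * stopping at the first known link (assumes newest-first ordering).
 *
 * @param string[] $feed_links  links in the order the feed lists them
 * @param array    $known_links known links as array keys (link => content_id)
 * @return string[]
 */
function new_links_until_known(array $feed_links, array $known_links): array
{
    $new = [];
    foreach ($feed_links as $link) {
        if (isset($known_links[$link])) {
            break; // everything after this is assumed to be older and already stored
        }
        $new[] = $link;
    }
    return $new;
}

$known = ['http://example.com/2' => 17, 'http://example.com/1' => 16];
$feed  = [
    'http://example.com/4',
    'http://example.com/3',
    'http://example.com/2',
    'http://example.com/1',
];
$fresh = new_links_until_known($feed, $known);
// $fresh holds the links ending in /4 and /3
```

As far as I can tell this only saves work in PHP, since the isset() check never touches the database, so on its own it probably does not address the database CPU problem.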

Source

2012-09-19 Jacob Kranz