2016-10-09

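Drupal: merging taxonomy terms with massive duplication

I have a database that has been used for research purposes. Unfortunately, during this research an algorithm was allowed to run for too long, and it inadvertently created duplicate taxonomy terms instead of reusing the original TID of the first instance of each term.

To fix this, we tried the "term_merge" and "taxonomy_manager" modules. term_merge provides an interface for removing duplicates, and it can limit the number of terms it loads at a time so that it does not exhaust the memory limit. In my use case, however, I cannot even load the configuration screen at /admin/structure/taxonomy/[My-Vocabulary]/merge, let alone the duplicates interface at /admin/structure/taxonomy/[My-Vocabulary]/merge/duplicates, because both exhaust the memory limit even though it is set to 1024M.

To work around this, I wrote a custom module that calls the term_merge function found in the term_merge module. Only one node bundle in this project uses the taxonomy vocabulary, so I could safely write my own logic to merge the duplicate terms without relying on term_merge at all. I would still prefer to use it, because it was designed for this purpose and should, in theory, make the process safer.

My module provides a page callback plus the logic to collect a list of TIDs that all refer to the same duplicated taxonomy term. Here is the code that contains the call to term_merge: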

// Use the first element, which has the lowest TID value, as the 'trunk'
// that all other terms will be merged into.
$trunk = $tids[0];

// Remove the first element from the branch array, so that the trunk
// is not merged into itself.
array_shift($tids);

// Set the merge settings array, similarly to the default values
// given in _term_merge_batch_process of term_merge.batch.inc.
$merge_settings = array(
    'term_branch_keep' => FALSE,
    'merge_fields' => array(),
    'keep_only_unique' => TRUE,
    'redirect' => -1,
    'synonyms' => array(),
);

term_merge($tids, $trunk, $merge_settings);

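This does not merge any terms, and it produces no errors or notices in watchdog or in the web server logs.

I have also tried calling term_merge once for each individual duplicate TID, instead of passing the whole array of TIDs.

I would appreciate any input on how best to use the term_merge function programmatically, or on any other approach that would let me remove the many duplicate terms from a large database in which some terms have thousands of duplicates.

For reference, here is the comment describing the parameters that term_merge takes, found in term_merge.module of the contributed term_merge module: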

/**
 * Merge terms one into another using batch API.
 *
 * @param array $term_branch
 *   A single term tid or an array of term tids to be merged, aka term branches
 * @param int $term_trunk
 *   The tid of the term to merge term branches into, aka term trunk
 * @param array $merge_settings
 *   Array of settings that control how merging should happen. Currently
 *   supported settings are:
 *   - term_branch_keep: (bool) Whether the term branches should not be
 *     deleted, also known as "merge only occurrences" option
 *   - merge_fields: (array) Array of field names whose values should be
 *     merged into the values of corresponding fields of term trunk (until
 *     each field's cardinality limit is reached)
 *   - keep_only_unique: (bool) Whether after merging within one field only
 *     unique taxonomy term references should be kept in other entities. If
 *     before merging your entity had 2 values in its taxonomy term reference
 *     field and one was pointing to term branch while another was pointing to
 *     term trunk, after merging you will end up having your entity
 *     referencing to the same term trunk twice. If you pass TRUE in this
 *     parameter, only a single reference will be stored in your entity after
 *     merging
 *   - redirect: (int) HTTP code for redirect from $term_branch to
 *     $term_trunk, 0 stands for the default redirect defined in Redirect
 *     module. Use constant TERM_MERGE_NO_REDIRECT to denote not creating any
 *     HTTP redirect. Note: this parameter requires Redirect module enabled,
 *     otherwise it will be disregarded
 *   - synonyms: (array) Array of field names of trunk term into which branch
 *     terms should be added as synonyms (until each field's cardinality limit
 *     is reached). Note: this parameter requires Synonyms module enabled,
 *     otherwise it will be disregarded
 *   - step: (int) How many term branches to merge per script run in batch. If
 *     you are hitting time or memory limits, decrease this parameter
 */
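
Since memory limits are the underlying problem here, the `step` setting in the comment above looks relevant. Below is a minimal sketch of a settings builder that adds it. The function name `mymodule_term_merge_settings` and the default of 10 are hypothetical, not part of term_merge, and whether `step` is honored may depend on the installed module version.

```php
<?php
/**
 * Hypothetical helper (not part of term_merge): builds a $merge_settings
 * array that also carries the documented 'step' option.
 *
 * @param int $step
 *   How many term branches to merge per batch run (assumed default: 10).
 */
function mymodule_term_merge_settings($step = 10) {
  return array(
    'term_branch_keep' => FALSE,
    'merge_fields' => array(),
    'keep_only_unique' => TRUE,
    // Same value as in the code above; the docblock names the constant
    // TERM_MERGE_NO_REDIRECT for "do not create a redirect".
    'redirect' => -1,
    'synonyms' => array(),
    // Never pass a step below 1, so each batch run makes progress.
    'step' => max(1, (int) $step),
  );
}
```

A smaller `step` trades a longer overall run for less work per request, which is the adjustment the docblock suggests when hitting time or memory limits.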

Answer

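This appears to happen because term_merge was written with the intention of being called from a form submit handler. My custom module called it from a context in which the batch it sets up was never actually started.

Explicitly calling the following resolved this:

batch_process();

No arguments need to be passed to the function.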
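
Putting the question's code and this fix together, a minimal sketch might look like the following. The function name `mymodule_merge_duplicates` is hypothetical, and the two callable parameters exist only so the flow can be exercised outside Drupal; in a real module they default to the actual `term_merge` and `batch_process` functions.

```php
<?php
/**
 * Hypothetical helper: merges duplicate terms into the lowest TID and
 * then starts the batch that term_merge() queued.
 *
 * @param array $tids
 *   TIDs that all refer to the same duplicated term.
 * @param callable $merge
 *   Defaults to term_merge(); replaceable for testing (assumption).
 * @param callable $process
 *   Defaults to batch_process(); replaceable for testing (assumption).
 *
 * @return int|false
 *   The trunk TID, or FALSE if there was nothing to merge.
 */
function mymodule_merge_duplicates(array $tids, $merge = 'term_merge', $process = 'batch_process') {
  // Sort so that the lowest TID becomes the trunk.
  sort($tids, SORT_NUMERIC);
  $trunk = array_shift($tids);
  if ($trunk === NULL || empty($tids)) {
    return FALSE;
  }

  $merge_settings = array(
    'term_branch_keep' => FALSE,
    'merge_fields' => array(),
    'keep_only_unique' => TRUE,
    'redirect' => -1,
    'synonyms' => array(),
  );

  // Queues the merge as a batch but does not run it outside a form submit.
  call_user_func($merge, $tids, $trunk, $merge_settings);
  // Starts the queued batch. In a normal page request this usually
  // redirects to the batch page, so the return below may not be reached.
  call_user_func($process);

  return $trunk;
}
```

In a page callback this would be called as `mymodule_merge_duplicates($tids);` with the defaults left in place.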


If you think this is the solution, you can accept your own answer. – pal4life