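Drupal: Merging taxonomy terms with massive duplication

I have a database that has been used for research purposes. Unfortunately, during this research an algorithm was allowed to run for too long, and it inadvertently created duplicate taxonomy terms instead of reusing the original TID of the first instance of each term.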
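To fix this, we tried the "term_merge" and "taxonomy_manager" modules. "term_merge" provides an interface for removing duplicates, and it can limit how many terms it loads at a time so that it does not exhaust the database server's memory limit. In my use case, however, I cannot even load the configuration screen at /admin/structure/taxonomy/[My-Vocabulary]/merge, let alone the duplicates screen at /admin/structure/taxonomy/[My-Vocabulary]/merge/duplicates. Both exhaust the memory limit, even though that limit is set to 1024M.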
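As a workaround, I wrote a custom module that calls the term_merge function found in the term_merge module. Since only one node bundle in this project uses the taxonomy vocabulary, I could safely write my own logic to merge the duplicate terms without relying on the term_merge module at all. I would still prefer to use it, because it was designed for this purpose and should, in theory, make the process safer.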
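My module provides a page callback, along with logic that builds a list of TIDs which all refer to the same duplicated taxonomy term. Here is the code that contains the call to the term_merge function: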
// Use the first element, with the lowest TID value, as the 'trunk'
// which all other terms will be merged into.
$trunk = $tids[0];
// Remove the first element from the branch array, to ensure the trunk
// is not being merged into itself.
array_shift($tids);
// Set the merge settings array, similarly to the default values
// which are given in _term_merge_batch_process of term_merge.batch.inc.
$merge_settings = array(
  'term_branch_keep' => FALSE,
  'merge_fields' => array(),
  'keep_only_unique' => TRUE,
  'redirect' => -1,
  'synonyms' => array(),
);
term_merge($tids, $trunk, $merge_settings);
This does not result in any terms being merged, and it produces no errors or notices in watchdog or in the web server logs.
I have also tried calling term_merge once for each individual duplicate TID, rather than passing the whole array of TIDs.
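To illustrate the trunk/branch selection described above, here is a minimal standalone sketch. It is not the module's actual code: the function name group_duplicate_tids and the sample rows are hypothetical, and it assumes that "duplicate" means terms with exactly the same name in one vocabulary. It sorts each group explicitly, since taking $tids[0] as the trunk only gives the lowest TID when the list is already in ascending order.

```php
<?php
/**
 * Hypothetical helper: group term rows by name and split each group of
 * duplicates into a trunk (lowest TID) and its branches (all other TIDs).
 *
 * @param array $rows
 *   Rows shaped like array('tid' => int, 'name' => string).
 *
 * @return array
 *   Map of term name => array('trunk' => int, 'branches' => int[]).
 */
function group_duplicate_tids(array $rows) {
  // Collect every TID under its (exact, case-sensitive) term name.
  $by_name = array();
  foreach ($rows as $row) {
    $by_name[$row['name']][] = (int) $row['tid'];
  }

  $groups = array();
  foreach ($by_name as $name => $tids) {
    // A name that occurs only once has nothing to merge.
    if (count($tids) < 2) {
      continue;
    }
    // Sort ascending so the first element really is the lowest TID.
    sort($tids, SORT_NUMERIC);
    // Take the trunk out of the list so it is never merged into itself.
    $trunk = array_shift($tids);
    $groups[$name] = array('trunk' => $trunk, 'branches' => $tids);
  }
  return $groups;
}

// Made-up sample data, for illustration only.
$rows = array(
  array('tid' => 12, 'name' => 'Apple'),
  array('tid' => 7, 'name' => 'Apple'),
  array('tid' => 9, 'name' => 'Pear'),
  array('tid' => 30, 'name' => 'Apple'),
);
$groups = group_duplicate_tids($rows);
```

Each resulting group could then supply the $tids and $trunk arguments for one term_merge call, which keeps the amount of data handled per call small.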
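I would appreciate any input on how best to use the term_merge function programmatically, or on any other approach that would let me remove the many duplicate terms from this large database, in which some terms have thousands of duplicates.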
For reference, here is the comment from term_merge.module in the contributed term_merge module that documents the parameters term_merge takes:
/**
 * Merge terms one into another using batch API.
 *
 * @param array $term_branch
 *   A single term tid or an array of term tids to be merged, aka term branches
 * @param int $term_trunk
 *   The tid of the term to merge term branches into, aka term trunk
 * @param array $merge_settings
 *   Array of settings that control how merging should happen. Currently
 *   supported settings are:
 *     - term_branch_keep: (bool) Whether the term branches should not be
 *       deleted, also known as "merge only occurrences" option
 *     - merge_fields: (array) Array of field names whose values should be
 *       merged into the values of corresponding fields of term trunk (until
 *       each field's cardinality limit is reached)
 *     - keep_only_unique: (bool) Whether after merging within one field only
 *       unique taxonomy term references should be kept in other entities. If
 *       before merging your entity had 2 values in its taxonomy term reference
 *       field and one was pointing to term branch while another was pointing to
 *       term trunk, after merging you will end up having your entity
 *       referencing to the same term trunk twice. If you pass TRUE in this
 *       parameter, only a single reference will be stored in your entity after
 *       merging
 *     - redirect: (int) HTTP code for redirect from $term_branch to
 *       $term_trunk, 0 stands for the default redirect defined in Redirect
 *       module. Use constant TERM_MERGE_NO_REDIRECT to denote not creating any
 *       HTTP redirect. Note: this parameter requires Redirect module enabled,
 *       otherwise it will be disregarded
 *     - synonyms: (array) Array of field names of trunk term into which branch
 *       terms should be added as synonyms (until each field's cardinality limit
 *       is reached). Note: this parameter requires Synonyms module enabled,
 *       otherwise it will be disregarded
 *     - step: (int) How many term branches to merge per script run in batch. If
 *       you are hitting time or memory limits, decrease this parameter
 */
If you think it is the solution, you can accept your own answer. – pal4life