Bash programming, background processes, PIDs and waiting for jobs to exit

I have written a bash script that continuously runs jobs to generate large quantities of simulation data.

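Essentially, once the script is running, it should keep launching background processes to generate data, subject to the limit that no more than 32 background jobs run at the same time. This is required to stop the processes from gobbling up all available memory and stalling the server.

My idea was to launch bash functions in the background and store the PIDs of those jobs. Then, after 32 jobs have been launched, use the wait command to wait until all of those PIDs have finished executing.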

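I think wait is the right tool to use here: as long as the PID of the process still exists when wait is run (which it will, because the simulations take 6 hours to run), wait will correctly detect the process exiting.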

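This seems a better option than simply polling processes and checking whether a particular PID exists, because PIDs are recycled and another process may have started with the same PID after ours finished. (Only by chance, if we are unlucky.)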

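However, the wait approach has two downsides. First, if the processes do not exit in order, wait will be called for PIDs that no longer exist, unless a new process has reused a PID we recorded earlier. Second, if one job takes much longer than the others (again by chance), we will be waiting for that one job to finish while the system has room for another 31 jobs, which cannot start because everything is waiting for that last PID to exit.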

This is probably getting a bit hard to visualize, so let me add some code...

I am using a while loop for this "algorithm":

c=0 # count total number of jobs launched (will not really use this here) 
PIDS=() # keep an array of PIDs 

# maximum number of simultaneous jobs and counter 
BATCH_SIZE=32 
BATCH_COUNT=0 

# just start looping 
while true 

    # edit: forgot to add this initially 
    # just check to see if a job has been run using file existence 
    if [ ! -e "$FILE_NAME_1" ] 
    then 

     # obvious 
     if [ "$BATCH_COUNT" -lt "$BATCH_SIZE" ] 
     then 

      ((BATCH_COUNT += 1)) 

      # this is used elsewhere to keep track of whether a job has been executed (the file existence is a flag)  
      touch "$FILE_NAME_1" 
      # call background job, parallel_job_run is a bash function 
      parallel_job_run $has_some_arguments_but_not_relevent & 
      # get PID of the background job just launched 
      PID=$! 
      echo "[ JOB ] : Launched job as PID=$PID" 
      PIDS+=("$PID") 

      # count total number of jobs 
      ((c=c+1)) 
     fi 

    else 
     # increment file name to use as that file already exists   
     # the "files" are for input/output 
     # the details are not particularly important 
     : # no-op placeholder, because bash does not allow an empty else branch 
    fi 

    true # prevent exit 

# the following is a problem 
do  
    if ((BATCH_COUNT < BATCH_SIZE)) 
    then 
     continue 
    else 
     # collect launched jobs 
     # this does not collect jobs in the order that they finish 
     # it will first wait for the first PID in the array to exit 
     # however this job may be the last to finish, in which case 
     # wait will be called with other array values with PID's which 
     # have already exited, and hence it is undefined behaviour 
     # as to whether we wait for a PID which doesn't exist (no problem) 
     # or a new process may have started which re-uses our PID 
     # and therefore we are waiting for someone else's process 
     # to finish which is nothing to do with our own jobs! 
     # we could be waiting for the PID of someone else's tty login 
     # for example! 
     for pid in "${PIDS[@]}" 
     do 
      wait "$pid" || echo "failed job PID=$pid" 
      ((BATCH_COUNT -= 1)) 
     done 
     PIDS=() # forget the collected PIDs so the next batch starts clean 
    fi 

done

Hopefully the combination of the code and the comments above makes it clear what I am trying to do.

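My other idea was to replace the for loop with another loop that continually checks whether each PID still exists (polling). This could be combined with sleep 1 to avoid hogging the CPU. However, the problem is the same as before: our process may exit, releasing its PID, and another process may happen to start and be given that PID. The advantage of this method is that we never wait more than about 1 second after a previous process exits before launching a new one.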
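A minimal sketch of this polling idea, assuming bash and `kill -0` as the existence check. The names `fake_job`, `MAX_JOBS`, `TOTAL_JOBS`, `POLL_INTERVAL` and `peak` are hypothetical, and the short sleeps stand in for the 6-hour simulations. It inherits the weakness described above: `kill -0` alone cannot tell whether a PID still belongs to the original job.

```shell
#!/usr/bin/env bash
# Polling sketch: refill free slots whenever a recorded PID disappears.

MAX_JOBS=3          # the question uses 32; kept small for a quick demo
TOTAL_JOBS=6        # hypothetical total number of jobs to run
POLL_INTERVAL=0.2   # the question suggests 1 second

# Hypothetical stand-in for parallel_job_run.
fake_job() { sleep 0.3; }

PIDS=()
launched=0
peak=0

while ((launched < TOTAL_JOBS)) || ((${#PIDS[@]} > 0)); do
    # Keep only the PIDs that still accept signal 0 (i.e. still exist).
    alive=()
    for pid in "${PIDS[@]}"; do
        if kill -0 "$pid" 2>/dev/null; then
            alive+=("$pid")
        fi
    done
    PIDS=("${alive[@]}")

    # Launch new jobs until the limit is reached or nothing is left to run.
    while ((launched < TOTAL_JOBS)) && ((${#PIDS[@]} < MAX_JOBS)); do
        fake_job &
        PIDS+=("$!")
        launched=$((launched + 1))
    done

    # Record the highest number of simultaneously tracked jobs.
    if ((${#PIDS[@]} > peak)); then
        peak=${#PIDS[@]}
    fi

    # Sleep between polls so the loop does not hog the CPU.
    if ((${#PIDS[@]} > 0)); then
        sleep "$POLL_INTERVAL"
    fi
done

echo "launched=$launched peak=$peak remaining=${#PIDS[@]}"
```

The delay before a freed slot is reused is bounded by `POLL_INTERVAL`, which is the trade-off against CPU usage mentioned above.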

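Can anyone give me any advice on how to deal with the problem I am having here?

I will keep updating this question today, for example by adding new information if I find any and by reformatting or rewording sections to make them clearer.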

Source

2016-11-01 user3728501

If you use the -n option with wait, it waits for the next process to finish, regardless of its PID. So that could be a solution.
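As a rough sketch of how that could look, assuming bash 4.3 or newer (where `wait -n` is available): the loop below keeps at most `MAX_JOBS` jobs running and starts a new one as soon as any job exits. `fake_job`, `reap_one`, `MAX_JOBS` and `TOTAL_JOBS` are hypothetical names, with `fake_job` standing in for the question's `parallel_job_run`.

```shell
#!/usr/bin/env bash
# Throttling sketch with `wait -n`: no PID bookkeeping is needed.

MAX_JOBS=4      # the question uses 32; kept small for a quick demo
TOTAL_JOBS=10   # hypothetical total number of jobs to run

# Hypothetical stand-in for parallel_job_run: works for 0.1 to 0.3 s.
fake_job() {
    sleep "0.$(( ($1 % 3) + 1 ))"
}

running=0
launched=0
finished=0
failed=0

# Block until any one background job exits and update the counters.
reap_one() {
    if ! wait -n; then
        failed=$((failed + 1))
    fi
    running=$((running - 1))
    finished=$((finished + 1))
}

for ((i = 0; i < TOTAL_JOBS; i++)); do
    # All slots busy: wait for whichever job finishes first.
    if ((running >= MAX_JOBS)); then
        reap_one
    fi
    fake_job "$i" &
    running=$((running + 1))
    launched=$((launched + 1))
done

# Collect the jobs that are still running at the end.
while ((running > 0)); do
    reap_one
done

echo "launched=$launched finished=$finished failed=$failed"
```

Because `wait -n` returns as soon as any job exits, one slow job no longer holds back the other 31 slots.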

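Also, Linux does not recycle PIDs immediately, as you seem to imply. It assigns the next available PID in sequence to a new process, and only starts again from the beginning after it has exhausted the maximum available PID.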
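On the PID reuse worry more generally, a small experiment may help. As far as I understand bash, `wait PID` only applies to children of the current shell, and the shell remembers the exit status of a background child that has already exited. The sketch below assumes bash; `child_pid`, `child_status` and `foreign_status` are names made up for the demo.

```shell
#!/usr/bin/env bash
# Experiment: what `wait` does for an exited child and for a foreign PID.

# Start a child that exits quickly with a recognisable status.
( exit 7 ) &
child_pid=$!
sleep 0.5                # let the child exit before we wait for it

# The child is already gone, but bash kept its exit status.
wait "$child_pid"
child_status=$?

# PID 1 exists on the system but is not a child of this shell,
# so bash refuses to wait for it and returns 127.
wait 1 2>/dev/null
foreign_status=$?

echo "child_status=$child_status foreign_status=$foreign_status"
```

If this holds on your system, waiting on a recorded PID after the job has exited should return that job's status, not block on an unrelated process that happens to reuse the number.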

Source

2016-11-01 12:09:26 unxnut
