Round-robin scheduling with a pandas DataFrame

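I have been working on a piece of code that reads a tab-delimited CSV file representing a series of processes with their start times and durations, and uses pandas to create a DataFrame from it. I then need to apply a simplified form of round-robin scheduling to find each process's turnaround time, taking the timeslice from user input.

So far I can read the CSV file in, label the columns, and sort it correctly. However, I get stuck when trying to build a loop that iterates over the rows to find each process's completion time.

The code so far looks like: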

import re
import sys

import numpy as np
import pandas as pd


# round robin
def rr():
    docname = sys.argv[1]
    method = sys.argv[2]
    # creates a variable from the user input to define timeslice
    timeslice = int(re.search(r'\d+', method).group())
    # use pandas to create a 2-d data frame from tab delimited file, set column 0 (process names) to string,
    # set columns 1 & 2 (start time and duration, respectively) to integers
    d = pd.read_csv(docname, delimiter="\t", header=None, dtype={'0': str, '1': np.int32, '2': np.int32})
    # sort d into d1 by values of start times[1], ascending
    d1 = d.sort_values(by=1)
    # create a 4th column, set to 0, for the completion time
    d1[3] = 0
    # change column names
    d1.columns = ['Process', 'Start', 'Duration', 'Completion']
    # initialize counter
    counter = 0
    # if any values in column 'Duration' are above 0, continue the loop
    while (d1['Duration']).any() > 0:
        for index, row in d1.iterrows():
            # if value in column 'Duration' > the timeslice, add the value of the timeslice to the current counter,
            # subtract it from the current value in column 'Duration'
            if row.Duration > timeslice:
                counter += timeslice
                row.Duration -= timeslice
                print(index, row.Duration)
            # if value in column 'Duration' <= the timeslice, add the current value of row.Duration to the counter,
            # subtract the Duration from itself, to make it 0,
            # set row.Completion to the current counter, which is the completion time for the process
            elif row.Duration <= timeslice and row.Duration != 0:
                counter += row.Duration
                row.Duration -= row.Duration
                row.Completion = counter
                print(index, row.Duration)
            # otherwise, if the value in Duration is already 0, print that index, with the "Done" indicator
            else:
                print(index, "Done")

Given a sample CSV file, d1 looks like

  Process  Start  Duration  Completion
3      p4      0       280           0
0      p1      5       140           0
1      p2     14        75           0
2      p3     36       320           0
5      p6     40         0           0
4      p5     67       125           0

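When I run my code with timeslice = 70, I get an infinite loop.

It seems to iterate through the loop correctly once and then repeat endlessly. However, print(d1['Completion']) gives all 0 values, which means the correct counter value is never assigned to d1['Completion'].

Ideally, the Completion values would be filled in with their corresponding times, given timeslice = 70, like: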

  Process  Start  Duration  Completion
3      p4      0       280         830
0      p1      5       140         490
1      p2     14        75         495
2      p3     36       320         940
5      p6     40         0         280
4      p5     67       125         620

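which I can then use to find the average turnaround time. For some reason, though, my loop seems to iterate once and then repeat endlessly. When I tried switching the order of the while and for statements, it iterated over each row repeatedly until it reached 0, which also gave incorrect completion times.

Thanks in advance.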

Source

2016-10-06 Z. Winters

Actually, you are not modifying the DataFrame. Try parsing the values of each row into a list, and then modify them in the list. – Acepcs

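Is there a way to keep the order of the parsed data associated with the process names? I tried what you are suggesting, but could not tell which completion time belonged to which process. I ended up with results sorted by completion time instead of alphabetically. Sorry, I am very new to Python, and the documentation does not explain this in a way I can follow. –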
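One way to keep each completion time attached to its process name, sketched here as an illustration rather than as what either poster wrote: convert the frame to a list of per-row dictionaries, update those, and rebuild the frame with the original index. The function name `simple_round_robin` and the sample frame are assumptions made for this example; like the question's code, it ignores the Start column.

```python
import pandas as pd


def simple_round_robin(df, timeslice):
    """Return a copy of df with 'Completion' filled in (hypothetical helper; ignores 'Start')."""
    # One dict per row, so every process name stays next to its own numbers.
    records = df.to_dict("records")
    remaining = [rec["Duration"] for rec in records]
    counter = 0
    while any(r > 0 for r in remaining):
        for i, rec in enumerate(records):
            if remaining[i] == 0:
                continue  # already finished (or had zero duration)
            run = min(timeslice, remaining[i])
            counter += run
            remaining[i] -= run
            if remaining[i] == 0:
                rec["Completion"] = counter
    # Rebuild the frame in the same row order and with the same index labels.
    return pd.DataFrame(records, index=df.index)


# Sample data copied from the question's table.
d1 = pd.DataFrame(
    {
        "Process": ["p4", "p1", "p2", "p3", "p6", "p5"],
        "Start": [0, 5, 14, 36, 40, 67],
        "Duration": [280, 140, 75, 320, 0, 125],
        "Completion": 0,
    },
    index=[3, 0, 1, 2, 5, 4],
)
result = simple_round_robin(d1, 70)
print(result)
```

Because the rows are never reordered, the rebuilt frame lists the processes in the same order as the input, and a zero-duration process such as p6 simply keeps a Completion of 0.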

I modified your code a little and it works. You were not overwriting the original values with the modified ones, so the loop never ended.

while (d1['Duration'] > 0).any():
    for index, row in d1.iterrows():
        # if value in column 'Duration' > the timeslice, add the value of the timeslice to the current counter,
        # subtract it from the current value in column 'Duration'
        if row.Duration > timeslice:
            counter += timeslice
            # row.Duration -= timeslice
            # !!!LOOK HERE!!! write to the DataFrame itself, not to the row
            d1.loc[index, 'Duration'] -= timeslice
            print(index, d1.loc[index, 'Duration'])
        # if value in column 'Duration' <= the timeslice, add the current value of row.Duration to the counter,
        # set the Duration to 0,
        # set Completion to the current counter, which is the completion time for the process
        elif row.Duration <= timeslice and row.Duration != 0:
            counter += row.Duration
            # row.Duration -= row.Duration
            # row.Completion = counter
            # !!!LOOK HERE!!!
            d1.loc[index, 'Duration'] = 0
            d1.loc[index, 'Completion'] = counter
            print(index, d1.loc[index, 'Duration'])
        # otherwise, if the value in Duration is already 0, print that index, with the "Done" indicator
        else:
            print(index, "Done")
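The difference between writing to the row and writing to the frame can be seen in a small sketch. The two-row frame and the variable names are made up for illustration. For a frame with mixed column types, as here, iterrows() hands back a copy of each row, so changes to that row do not reach the frame; the pandas documentation warns about this.

```python
import pandas as pd

# Hypothetical two-row frame with mixed dtypes (str and int), like the question's d1.
df = pd.DataFrame({"Process": ["p1", "p2"], "Duration": [140, 75]})

# Writing to the row yielded by iterrows() only changes a temporary Series.
for index, row in df.iterrows():
    row["Duration"] -= 70
after_row_write = df["Duration"].tolist()

# Writing through the frame with .loc changes the stored values.
for index in df.index:
    df.loc[index, "Duration"] -= 70
after_loc_write = df["Duration"].tolist()

print(after_row_write)  # frame unchanged by the first loop
print(after_loc_write)  # frame updated by the second loop
```

This is why the while condition in the question never became false: the Duration column in d1 was never reduced.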

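By the way, I guess you are trying to simulate a process scheduling algorithm. In that case you have to take 'Start' into account, because not every process starts at the same time.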
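A minimal sketch of what taking 'Start' into account could look like, under assumptions that are not in the original post: a single CPU, a FIFO ready queue, no context-switch cost, unique index labels, and the convention that processes arriving during a slice are queued ahead of the process that was just preempted. The function name `round_robin_with_arrivals` is hypothetical.

```python
from collections import deque

import pandas as pd


def round_robin_with_arrivals(df, timeslice):
    """Return a copy of df, sorted by 'Start', with 'Completion' filled in (hypothetical helper)."""
    out = df.sort_values("Start", kind="stable").copy()
    labels = list(out.index)
    start = out["Start"].to_dict()
    remaining = out["Duration"].to_dict()
    completion = {}
    ready = deque()
    clock = 0
    nxt = 0  # position in labels of the next process that has not arrived yet

    while ready or nxt < len(labels):
        if not ready:
            # CPU is idle: jump ahead to the next arrival if it lies in the future.
            clock = max(clock, start[labels[nxt]])
        # Admit every process that has arrived by now.
        while nxt < len(labels) and start[labels[nxt]] <= clock:
            ready.append(labels[nxt])
            nxt += 1
        current = ready.popleft()
        run = min(timeslice, remaining[current])
        clock += run
        remaining[current] -= run
        # Assumed convention: arrivals during this slice go ahead of the preempted process.
        while nxt < len(labels) and start[labels[nxt]] <= clock:
            ready.append(labels[nxt])
            nxt += 1
        if remaining[current] > 0:
            ready.append(current)
        else:
            completion[current] = clock
    out["Completion"] = pd.Series(completion)
    return out


# Sample data copied from the question's table.
d1 = pd.DataFrame(
    {
        "Process": ["p4", "p1", "p2", "p3", "p6", "p5"],
        "Start": [0, 5, 14, 36, 40, 67],
        "Duration": [280, 140, 75, 320, 0, 125],
    },
    index=[3, 0, 1, 2, 5, 4],
)
scheduled = round_robin_with_arrivals(d1, 70)
# Turnaround time is completion time minus arrival time.
scheduled["Turnaround"] = scheduled["Completion"] - scheduled["Start"]
print(scheduled)
print("Average turnaround:", scheduled["Turnaround"].mean())
```

Other queueing conventions are possible, and they can change individual completion times, so the numbers this prints depend on the assumptions listed above.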

(Your ideal table is wrong in some places.)

Source

2016-10-06 04:50:42 Acepcs
