Deleting lines in a text file

Below is a sample of the text file I have:

> 1 -4.6 -4.6 -7.6 
> 
> 2 -1.7 -3.8 -3.1 
> 
> 3 -1.6 -1.6 -3.1

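The data in the text file is separated by tabs, and the first column gives the position.

I need to iterate over every value in the text file except those in column 0 and find the lowest value.

Once the lowest value is found, it needs to be written to a new text file together with its column name and position. Column 0 is named "position", column 1 "fifteen", column 2 "sixteen" and column 3 "seventeen".

For example, the lowest value in the data above is "-7.6", and it is in column 3, which is named "seventeen". Therefore "7.6", "17" and its position value, which is 1 in this case, need to be written to the new text file.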
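The lookup described above could be sketched in Python roughly as follows. This is only an illustration: the function name `find_lowest`, the `COLUMN_NAMES` mapping and the in-memory `sample` list are assumptions made for the example, and the rows are assumed to be whitespace- or tab-separated with no leading ">".

```python
# Hypothetical mapping from column index to the numeric form of the column name.
COLUMN_NAMES = {1: 15, 2: 16, 3: 17}


def find_lowest(lines):
    """Return (value, column_name, position) for the lowest value, or None."""
    best = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        position = int(fields[0])  # column 0 holds the position
        for col, text in enumerate(fields[1:], start=1):
            value = float(text)
            # Strict '<' keeps the first occurrence when values tie.
            if best is None or value < best[0]:
                best = (value, COLUMN_NAMES[col], position)
    return best


sample = ["1\t-4.6\t-4.6\t-7.6", "2\t-1.7\t-3.8\t-3.1", "3\t-1.6\t-1.6\t-3.1"]
print(find_lowest(sample))  # (-7.6, 17, 1)
```

The sketch returns the signed value; taking `abs()` of it would give the "7.6" mentioned in the example.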

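I then need to delete a number of lines from the text file above.

E.g. the lowest value above is "-7.6"; it was found at position "1" and in column 3, which is named "seventeen". I therefore need to delete 17 lines from the text file, starting from and including position 1.

So the column in which the lowest value is found gives the number of lines to delete, and the position at which it is found gives where the deletion starts.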
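The deletion rule could look something like the sketch below. The helper name `delete_rows` and the `(position, values)` tuple layout are assumptions for illustration, and the sketch further assumes that positions are consecutive integers, so that "17 lines starting at position 1" means positions 1 through 17.

```python
def delete_rows(rows, start_position, count):
    """Drop rows whose position lies in [start_position, start_position + count)."""
    end = start_position + count
    return [(pos, values) for pos, values in rows
            if not (start_position <= pos < end)]


# Positions 1..20; the values are irrelevant for this demonstration.
rows = [(p, None) for p in range(1, 21)]
remaining = delete_rows(rows, start_position=1, count=17)
print([pos for pos, _ in remaining])  # [18, 19, 20]
```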

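Source

2010-04-12 Jenny

Show us what you have already tried. – Mark 2010-04-12 16:07:23

Your requirements for the code are bizarre. What is the motivation for this? – 2010-04-12 16:08:48

It is a text file containing biological data. I just need to find the lowest values with no overlap, hence the deletion. – Jenny 2010-04-12 16:13:44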

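If I understand what you want: open this file for reading and another file for writing, then copy every line that does not match the filter condition: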

# somepredicate is a placeholder: it should return True for lines to drop
with open('somefile', 'r') as readfile, open('otherfile', 'w') as writefile:
    for line in readfile:
        if not somepredicate(line):
            writefile.write(line)

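Source

2010-04-12 16:08:54

Of course, at this point you could write your program as a standard input filter (read from standard input, write to standard output) and do the appropriate redirection from your shell. – jemfinch 2010-04-12 16:10:36

Sure, that is a viable approach (and the one I usually take). – 2010-04-12 16:15:45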
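A standard-input filter along the lines suggested in the comments might look like the sketch below. The predicate `should_drop` is a hypothetical stand-in (here it only drops blank lines), and `filter_stream` is a name invented for the example. The streams are passed in as arguments so the same function works on `sys.stdin`/`sys.stdout` or on in-memory buffers.

```python
import io
import sys


def should_drop(line):
    """Hypothetical predicate: drop blank lines. Replace with the real filter."""
    return line.strip() == ""


def filter_stream(source, sink):
    """Copy every line from source to sink unless should_drop() rejects it."""
    for line in source:
        if not should_drop(line):
            sink.write(line)


# Demonstration on in-memory streams. In a real script the call would be
# filter_stream(sys.stdin, sys.stdout) instead.
demo_in = io.StringIO("1\t-4.6\n\n2\t-1.7\n")
filter_stream(demo_in, sys.stdout)
```

With `filter_stream(sys.stdin, sys.stdout)` as the entry point, the shell does the file handling, for example `python filter.py < somefile > otherfile` (the script name is made up here).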

Here is a stab at it (although your requirements are hard to follow):

def extract_bio_data(input_path, output_path):
    # map column indexes (after popping the row number) to the number of rows to skip
    col_index = {0: 15,
                 1: 16,
                 2: 17}

    skip_to_position = 0
    # 'with' closes both files, even if a line fails to parse
    with open(input_path, 'r') as input_file, open(output_path, 'w') as output_file:
        # write the output file's headers
        output_file.write('\t'.join(('position', 'min_value', 'rows_skipped')) + '\n')

        for line in input_file:
            # remove the '> ' from the beginning of the line and strip newline characters off the end
            line = line[2:].strip()

            # if the line contains no data, skip it
            if line == '':
                continue

            # split the columns on whitespace (change this to split('\t') for splitting only on tabs)
            columns = line.split()

            # extract the row number/position of this data
            position = int(columns.pop(0))

            # this is where we skip rows/positions
            if position < skip_to_position:
                continue

            # if two columns share the minimum value, this will be the first encountered in the list
            min_index = columns.index(min(columns, key=float))

            # this is an integer version of the 'column name' which corresponds to the number of rows that need to be skipped
            rows_to_skip = col_index[min_index]

            # write data to your new file (row number, minimum value, number of rows skipped)
            output_file.write('\t'.join(str(x) for x in (position, columns[min_index], rows_to_skip)) + '\n')

            # set the number of data rows to skip from this position
            skip_to_position = position + rows_to_skip


if __name__ == '__main__':
    in_path = r'c:\temp\test_input.txt'
    out_path = r'c:\temp\test_output.txt'
    extract_bio_data(in_path, out_path)

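Things I don't understand:

Is there really a ">" at the beginning of every line, or is that a copy/paste error?
- I assumed it is not an error.
Do you want "7.6" or "-7.6" written to the new file?
- I assumed you want the original value.
Do you want to skip lines in the file, or positions based on the first column?
- I assumed you want to skip positions.
You said you want to delete data from the original file.
- I assumed skipping positions is sufficient.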
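If the first two assumptions turn out to be wrong, the parsing could be adjusted along these lines. This is a sketch only: `parse_row` is a hypothetical helper, not part of the code above. It accepts lines with or without a leading ">", and `abs()` shows how the unsigned "7.6" could be produced instead of the original value.

```python
def parse_row(line):
    """Return (position, values), or None for a line that holds no data."""
    # Drop surrounding whitespace and an optional leading '>' before splitting.
    fields = line.strip().lstrip(">").split()
    if not fields:
        return None
    return int(fields[0]), [float(x) for x in fields[1:]]


print(parse_row("> 1 -4.6 -4.6 -7.6"))   # (1, [-4.6, -4.6, -7.6])
print(parse_row("1\t-4.6\t-4.6\t-7.6"))  # (1, [-4.6, -4.6, -7.6])
print(parse_row(">"))                    # None

position, values = parse_row("> 1 -4.6 -4.6 -7.6")
print(abs(min(values)))                  # 7.6
```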

Source

2010-04-12 19:34:31 tgray
