Removing unwanted lines while decompressing files, before writing them to a file

I get an error when I try to decompress files, remove the lines I am not interested in, and finally write the remaining lines to a file. Here is my code:

import gzip, os, sys
dataset_names = []
dir_path = ('local drive path')
dataset_names = os.listdir(dir_path)
count = 0
read_zip = []
for dataset in dataset_names:
    each_dataset = os.path.join(dir_path+'\\'+dataset+'\\'+'soft'+'\\'+dataset+'_full'+'.soft')
    with gzip.open(each_dataset+'.gz', 'rb') as each_gzip_file:
        if count == 2: # I wanted to check with 2 datasets first
            continue
        for line in each_gzip_file:
            if line.startwith !=('#', '!', '^'):
                continue
            read_zip.append('\t' + line)
        with open('name of a file', 'wb') as f:
            f.writelines(read_zip)
    print(dataset)
    count += 1

Here is the error:

AttributeError: 'bytes' object has no attribute 'startwith'

Then I tried changing it to the code below:

...... 
.......    
for line in each_gzip_file: 
       if not PY3K: 
        if lines.startwith != ('#', '!', '^'): 
         continue; 
        lines.append(line) 

       else: 
        lines.append(line.decode('cp437'))     
        makeitastring = ''.join(map(str, lines)) 
       with open('fine name', 'wb') as f: 

        my_str_as_bytes = str.encode(str(,lines)) 
        f.writelines(makeitastring)

This time I got this error:

TypeError: a bytes-like object is required, not 'str'

I also changed it to the following, but that did not work either. It looks as if it iterates over and over again:

for line in each_gzip_file: 
       read_zip.append(line); 
       for x in read_zip: 
        if str(x).startswith != ('#', '!', '^'): 
        continue;       
       else: 
        final.append(x);       

       with open('file name', 'ab') as f: 

       f.writelines(final)

Am I missing something? Thanks,


2017-09-15 NinaDev

Can you specify which line actually triggers the error? – Saustin

@Saustin These lines: for line in each_gzip_file: if line.startwith != ('#', '!', '^'): continue; – NinaDev

Have you tried 'str(line).startwith != ...'? – VBB

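I see two errors. First, you misspelled the method name. It is bytes.startswith(), not bytes.startwith(). Note the "s" between "start" and "with".

Second, the code line.startswith != ('#', '!', '^') does not do what you think. startswith() is a method of bytes objects, and what you want is to call that method with '#' and the others as its argument. Right now you are asking "is this method equal to this tuple of three strings?" That makes little sense here, but Python will happily answer "no", so the != comparison is always True and the continue runs for every line. What you want is line.startswith((b'#', b'!', b'^')). (The b prefix is needed to make these bytes literals rather than str, because the two are distinct types in Python 3.) This returns True if the line starts with any of those three characters.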
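To make the difference concrete, here is a minimal sketch that contrasts comparing the method object with calling it. The sample line contents are made up for illustration and are not taken from the asker's files.

```python
# Hypothetical sample lines, as they would come out of a file opened in 'rb' mode.
header_line = b"#ID_REF = identifier\n"
data_line = b"ID_REF\tVALUE\n"

# Comparing the bound method itself with a tuple: the two are never equal,
# so this expression is True no matter what the line contains.
always_true = header_line.startswith != ('#', '!', '^')

# Calling the method with one tuple of bytes prefixes performs the real check.
is_header = header_line.startswith((b'#', b'!', b'^'))
is_data_header = data_line.startswith((b'#', b'!', b'^'))

print(always_true, is_header, is_data_header)  # True True False
```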


2017-09-15 05:30:20 bnaecker

Thanks @bnaecher, but I still get an error: if line.startswith(b'#', b'!', b'^'): TypeError: slice indices must be integers or None or have an __index__ method – NinaDev

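'bytes.startswith' takes a bytes object or a tuple of bytes objects as its first argument, but because you left out the inner parentheses, you are passing them as the first, second, and third arguments. The second and third arguments are expected to be integers, which causes the error message you see. Read bnaecker's code again; it is correct there. – Jeronimo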
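A short sketch of what the optional second and third arguments do may help here. They are the start and end offsets of the slice to test, which is why bytes values in those positions are rejected. The sample line is invented for the example.

```python
line = b"!Series_title = example\n"  # hypothetical sample line

# Three separate arguments: b'!' and b'^' land in the start/end positions,
# which must be integers, so Python raises TypeError.
raised_type_error = False
try:
    line.startswith(b'#', b'!', b'^')
except TypeError:
    raised_type_error = True

# One tuple argument: the prefixes are tried in turn.
matched = line.startswith((b'#', b'!', b'^'))

# Legitimate use of the second argument: test the prefix starting at offset 1.
matched_from_offset = line.startswith(b'Series', 1)
```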

@bnaecker Thanks! You are right, I overlooked a pair of parentheses! – NinaDev
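Putting the pieces of this thread together, the whole task might look like the sketch below. It assumes the goal is to drop lines that start with '#', '!' or '^' and keep the rest; if the opposite is wanted, invert the test. The function name write_filtered, the file names, and the sample contents are hypothetical and exist only for the demonstration.

```python
import gzip
import os
import tempfile

PREFIXES = (b'#', b'!', b'^')  # bytes literals, because 'rb' mode yields bytes


def write_filtered(gz_path, out_path, prefixes=PREFIXES):
    """Copy lines from a gzip file to out_path, skipping lines that start with a prefix."""
    kept = 0
    # Both files are opened once; each kept line is written as soon as it is read.
    with gzip.open(gz_path, 'rb') as src, open(out_path, 'wb') as dst:
        for line in src:
            if line.startswith(prefixes):
                continue  # unwanted line (assumption: prefixed lines are dropped)
            dst.write(line)
            kept += 1
    return kept


# Demonstration on a small made-up gzip file in a temporary directory.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, 'example_full.soft.gz')
out_path = os.path.join(tmp_dir, 'filtered.txt')
with gzip.open(gz_path, 'wb') as f:
    f.write(b"^DATABASE = example\n"
            b"!Series_title = demo\n"
            b"#ID_REF = identifier\n"
            b"ID_REF\tVALUE\n"
            b"1007_s_at\t5.2\n")

kept = write_filtered(gz_path, out_path)
with open(out_path, 'rb') as f:
    result = f.read()
```

Writing inside the read loop with a single open output file avoids both building a large list in memory and reopening the output for every line, which may be what made the third attempt appear to iterate over and over. Staying in bytes from read to write also sidesteps the "a bytes-like object is required, not 'str'" error.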
