Read a file separated by \n, but ignore the last \n

I have a file named list.txt that looks like this:

input1 
input2 
input3

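I'm sure there is no empty line after the last line (input3). I also have a Python script that reads this file line by line and writes some text to create 3 files (one per line):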

import os
os.chdir("/Users/user/Desktop/Folder")

with open('list.txt','r') as f:
    lines = f.read().split('\n')

    for l in lines:
        header = "#!/bin/bash \n#BSUB -J %s.sh \n#BSUB -o /scratch/DBC/user/%s.sh.out \n#BSUB -e /scratch/DBC/user/%s.sh.err \n#BSUB -n 1 \n#BSUB -q normal \n#BSUB -P DBCDOBZAK \n#BSUB -W 168:00\n"%(l,l,l)
        script = "cd /scratch/DBC/user\n"
        script2 = 'grep "input" %s > result.%s.txt\n'%(l,l)
        all= "\n".join([header,script,script2])

        with open('script_{}.sh'.format(l), 'w') as output:
            output.write(all)

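My problem is that this creates 4 files, not 3: script_input1.sh, script_input2.sh, script_input3.sh and script_.sh. The last file has no text where the other files have input1, input2 or input3.

It seems Python reads my list.txt line by line, but when it reaches "input3" it somehow keeps going? How can I make Python read my file line by line, separated by "\n", but stop after the last piece of text?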


2017-10-11 m93

Possible duplicate of "Remove the newline character in a list read from a file" (https://stackoverflow.com/questions/4319236/remove-the-newline-character-in-a-list-read-from-a-file) – Mort

I'll say it again (https://stackoverflow.com/questions/46685755/python-script-to-make-multiple-bash-scripts#comment80321657_46685755): you should probably rethink your approach. – tripleee

First, don't read the whole file into memory when you don't have to. Files are iterable, so the proper way to read a file line by line is:

with open("/path/to/file.ext") as f:
    for line in f:
        do_something_with(line)

Now, in your for loop, you just have to strip the line and, if it is empty, ignore it:

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        do_something_with(line)
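
To make the strip-and-skip step concrete, here is a minimal sketch that runs the same loop over an in-memory file. The io.StringIO stand-in, the sample text and the name kept are assumptions for illustration only; they are not part of the original answer.

```python
import io

# Stand-in for an open file: three names, a final newline, then one extra blank line.
fake_file = io.StringIO("input1\ninput2\ninput3\n\n")

kept = []
for line in fake_file:
    line = line.strip()   # drop the trailing "\n" and any surrounding whitespace
    if not line:          # an empty string is falsy, so blank lines are skipped
        continue
    kept.append(line)

print(kept)
```

In other words, strip() removes the newline at the end of each line, and "if not line: continue" moves on to the next line whenever nothing is left after stripping.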

Slightly unrelated, but Python has multiline strings, so you don't need the concatenation either:

# not sure I got it right actually ;)
# The backslash after the opening quotes keeps "#!/bin/bash" on the first line.
script_tpl = """\
#!/bin/bash
#BSUB -J {line}.sh
#BSUB -o /scratch/DBC/user/{line}.sh.out
#BSUB -e /scratch/DBC/user/{line}.sh.err
#BSUB -n 1
#BSUB -q normal
#BSUB -P DBCDOBZAK
#BSUB -W 168:00
cd /scratch/DBC/user
grep "input" {line} > result.{line}.txt
"""

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        script = script_tpl.format(line=line)
        with open('script_{}.sh'.format(line), 'w') as output:
            output.write(script)

One last point: avoid changing directories in your script; build the full paths with os.path.join() instead.
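
As a rough sketch of that last point, the output path can be built with os.path.join() rather than relying on os.chdir(). The helper name script_path and the temporary directory used as out_dir are assumptions for illustration; in the question the directory would be /Users/user/Desktop/Folder.

```python
import os
import tempfile

def script_path(out_dir, name):
    # Hypothetical helper: full path of the script file for one input name.
    return os.path.join(out_dir, 'script_{}.sh'.format(name))

# A temporary directory stands in for the real output folder.
with tempfile.TemporaryDirectory() as out_dir:
    for name in ['input1', 'input2', 'input3']:
        with open(script_path(out_dir, name), 'w') as output:
            output.write('#!/bin/bash\n')
    created = sorted(os.listdir(out_dir))

print(created)
```

The script then works the same way no matter which directory it is started from.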


2017-10-11 15:01:41

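Thanks @bruno desthuilliers. A question about the last code block: in the line "with open('script_{}.sh'.format(l), 'w') as output:", I should replace "l" with "line", right? Because "l" is no longer defined in this script. – m93

Yes, of course. I fixed it. –

One last question, about the part that says "line = line.strip(); if not line: continue": does it mean strip blank lines, or strip the newline character? And continue if there is no such whitespace or newline? Sorry, I'm very new to Python, so it's not very clear to me. – m93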

With your current approach, you'll want to:

Check whether the last element of lines is empty (lines[-1] == '').
If so, discard it (lines = lines[:-1]).

with open('list.txt','r') as f: 
    lines = f.read().split('\n') 

if lines[-1] == '': 
    lines = lines[:-1] 

for line in lines:  
    print(line)

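Don't forget that it is legal for a file not to end with a newline character (and therefore to have no empty last line)... the check above handles that case too.

Also, as @setsquare pointed out, you might want to try readlines() instead: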

with open('list.txt', 'r') as f: 
    lines = [ line.rstrip('\n') for line in f.readlines() ] 

for line in lines: 
    print(line)
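
To check both cases mentioned in this answer (text that ends with a newline and text that does not), the following sketch applies the same trimming logic to plain strings. The function name split_lines and the sample strings are assumptions for illustration.

```python
def split_lines(text):
    # Split on newline and drop the single empty string left by a final newline.
    lines = text.split('\n')
    if lines[-1] == '':
        lines = lines[:-1]
    return lines

with_newline = split_lines('input1\ninput2\ninput3\n')
without_newline = split_lines('input1\ninput2\ninput3')

print(with_newline)
print(without_newline)
```

Both calls return the same three names. If a different technique is acceptable, the built-in str.splitlines() gives the same result for these two inputs.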


2017-10-11 14:47:38 Attie

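What if there are multiple blank lines at the end? – randomir

If handling blank lines is a concern, then we have a different problem... this only handles the common "_empty last line_" case. – Attie

Have you considered using readlines() instead of read()? That lets Python handle for you whether or not the last line ends with \n.

Keep in mind that if the input file has a \n at the end of its last line, then using read() and splitting on '\n' will create an extra value. For example: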

my_string = 'one\ntwo\nthree\n' 
my_list = my_string.split('\n') 
print(my_list)
# >> ['one', 'two', 'three', '']

Potential solution

with open('list.txt') as f:
    lines = f.readlines()
# remove newlines
lines = [line.strip() for line in lines]
# remove any empty values, just in case
lines = list(filter(bool, lines))

For a simple example, see: How do I read a file line-by-line into a list?


2017-10-11 14:52:44 setsquare

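Why use 'readlines()' at all? 'line = [line.strip() for line in f]' does the same thing. But that won't solve the OP's problem: you still need to filter out the empty lines. –

Fair enough, added an edit. – setsquare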

I think you are using split the wrong way.

If you have the following:

text = 'xxx yyy' 
text.split(' ') # or simply text.split()

The result will be

['xxx', 'yyy']

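Now, if you have:

text = 'xxx yyy ' # extra space at the end
text.split(' ') # explicit separator; a bare text.split() would drop the empty string

The result will be

['xxx', 'yyy', '']

because split returns whatever comes before and after each ' ' (space). In this case, an empty string follows the last space.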
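
The difference between the two call forms is easy to check directly. In the sketch below (the variable names are illustrative), split() with no argument splits on runs of whitespace and drops empty strings at the ends, while split(' ') splits at every single space and therefore keeps the trailing empty string.

```python
text = 'xxx yyy '  # extra space at the end

no_arg = text.split()        # whitespace-aware: no empty strings in the result
with_sep = text.split(' ')   # one split per space: the trailing '' is kept

print(no_arg)    # ['xxx', 'yyy']
print(with_sep)  # ['xxx', 'yyy', '']
```

The same applies to split('\n') on text that ends with a newline, which is where the extra script_.sh file in the question comes from.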

There are some functions you can use:

strip([chars]) # removes the given chars from the beginning and end of a string (whitespace by default)

Example:

text = '___text_about_something___'
text.strip('_')

The result will be:

'text_about_something'

For your specific problem, you can simply write:

lines = f.readlines() # read all lines of the file; each one still ends with '\n'
for l in lines:
    l = l.strip() # remove the newline and any extra whitespace at the start or end of the line
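
Two details are easy to miss here: readlines() keeps the trailing newline on each line, and strip() returns a new string rather than changing the original, so its result has to be stored. A minimal sketch, assuming an io.StringIO object as a stand-in for the open file f:

```python
import io

f = io.StringIO('input1\ninput2\n')   # stand-in for an open file

raw = f.readlines()                   # each element still ends with '\n'
cleaned = [l.strip() for l in raw]    # keep the stripped copies

print(raw)
print(cleaned)
```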


2017-10-11 15:07:52 klaus

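f.read() returns a string that ends with a newline, and split dutifully treats that newline as separating the last line from an empty string. It isn't clear why you explicitly read the whole file into memory; just iterate over the file object and let it handle the line splitting.

with open('list.txt','r') as f:
    for l in f:
        l = l.rstrip('\n')  # each line still carries its trailing newline
        # ...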


2017-10-11 15:08:37 chepner
