2017-05-31 96 views
0

I want to use the string split method to extract information from each line into lists. Reading a file - python?

+1

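What is your file format? You probably don't want to throw away the line information, which is what '.read().split()' will do (it splits on all whitespace). – Ryan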

+0

.readlines() might be better –

+0

@Ryan It's just a table: last name, then exam 1, then exam 2 (up to exam 4) – Nora

Answers

1

Use splitlines, it works better here:

with open('scores.txt', 'r') as f:  # the with block closes the file for us
    lines = f.read().splitlines()
exam_one = []
for line in lines:
    fields = line.split()  # split, not strip
    exam_one.append(int(fields[2]))  # or use float() if scores can be fractional
print(exam_one)  # => [100, 82, 94, 89, 87]
+0

Thanks, but how does this answer my question – Nora

+0

By the way, exam_one should be built from 'line[2]', not '[2::8]' –

0

Don't read the whole file into memory first. File objects are iterators.

result = []
with open('scores.txt') as f:
    for line in f:
        # E.g., fields == ['Hopper,', 'Grace', '100', '98', '87', '97']
        fields = line.strip().split()

It's not clear what you want as the final result; the first grade on each line, perhaps? After splitting the line, you can get that with

result.append(fields[2]) 
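A minimal sketch of the complete loop, assuming the name-plus-four-scores layout discussed in this thread. The function name `first_exam_scores` and the two-line sample file are illustrative, not part of the original answer. The point it shows is that the append has to sit inside the for loop; if it runs after the loop has finished, only the last line's value is collected.

```python
import os
import tempfile


def first_exam_scores(path):
    """Return the third field (the first exam score) of every line in the file."""
    result = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 3:  # skip blank or short lines
                continue
            result.append(fields[2])  # inside the loop: runs once per line
    return result


# Hypothetical sample data written to a temporary file for the demonstration.
sample = "Hopper, Grace 100 98 87 97\nKnuth, Donald 82 87 92 81\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

scores = first_exam_scores(tmp.name)
os.remove(tmp.name)
print(scores)
```

The values are still strings at this point; wrap `fields[2]` in `int()` or `float()` if you need numbers.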
+0

I wrote a sample result at the end of my comment.. I want to create lists containing the contents of each column – Nora

+0

That list contains the contents of the *first* column, as I assumed in the question. It doesn't contain the contents of *each* (or *every*) column. – chepner

+0

When I use your code, it only prints the last line. – Nora

1

I don't know what your file looks like, but I assume it is something like this:

Hopper, Grace 100 98 87 97 
Knuth, Donald 82 87 92 81 
Goldberg, Adele 94 96 90 91 
Kernighan, Brian 89 74 89 77 
Liskov, Barbara 87 97 81 85 

I also don't understand what output you want, but I assume it is something like this:

[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']] 

I came up with this one-liner (for Python 3.6):

with open('scores.txt', 'r') as file: 
    print([[value for value in line.strip().replace(',','').split()] for line in file]) 

Which is the same as:

with open('scores.txt', 'r') as file:
    tmp = []
    for line in file:
        # Strip the newline, drop the commas, then split on whitespace
        tmp.append(line.strip().replace(',', '').split())
print(tmp)

Output (the commas are gone because of the replace(',','') call):

[['Hopper', 'Grace', '100', '98', '87', '97'], ['Knuth', 'Donald', '82', '87', '92', '81'], ['Goldberg', 'Adele', '94', '96', '90', '91'], ['Kernighan', 'Brian', '89', '74', '89', '77'], ['Liskov', 'Barbara', '87', '97', '81', '85']]

Which is the same as:

[
    ['Hopper', 'Grace', '100', '98', '87', '97'],
    ['Knuth', 'Donald', '82', '87', '92', '81'],
    ['Goldberg', 'Adele', '94', '96', '90', '91'],
    ['Kernighan', 'Brian', '89', '74', '89', '77'],
    ['Liskov', 'Barbara', '87', '97', '81', '85']
]

I used print() to show the output, but you can assign the result to a variable instead if that is what you want.

P.S.: I have found a simpler solution:

with open('scores.txt', 'r') as file: 
    print([line.split() for line in file.read().replace(',','').splitlines()]) 
+0

Please don't imply that writing everything on one line is a good idea. – chepner

+0

@chepner I know, which is why I also wrote the multi-line version. It comes down to the programmer's preference. –

+0

@dawg OK, I'll do my best, but I don't know much about coding. –

1

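Suppose you have the following string of words (separated by horizontal whitespace) and lines (separated by \n, i.e. vertical whitespace):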

>>> print(data) 
Hopper, Grace 100 98 87 97 
Knuth, Donald 82 87 92 81 
Goldberg, Adele 94 96 90 91 
Kernighan, Brian 89 74 89 77 
Liskov, Barbara 87 97 81 85 

If you just use .split(), you lose all distinction between lines and words:

>>> data.split() 
['Hopper,', 'Grace', '100', '98', '87', '97', 'Knuth,', 'Donald', '82', '87', '92', '81', 'Goldberg,', 'Adele', '94', '96', '90', '91', 'Kernighan,', 'Brian', '89', '74', '89', '77', 'Liskov,', 'Barbara', '87', '97', '81', '85'] 

To keep the distinction, you need to combine .splitlines() with .split():

>>> [line.split() for line in data.splitlines()] 
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']] 
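Once each line is split, every field is still a string and the last name still carries its comma. The following is only a sketch of one way to tidy that up, assuming every name is exactly two tokens (last name with a trailing comma, then first name) followed by the scores. The variable names `records`, `last`, `first`, and `scores` are illustrative.

```python
# Two lines of the sample data from this thread.
data = """Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81"""

records = []
for line in data.splitlines():
    # Star unpacking: the first two words are the name, the rest are scores.
    last, first, *scores = line.split()
    records.append((last.rstrip(','), first, [int(s) for s in scores]))

print(records[0])  # ('Hopper', 'Grace', [100, 98, 87, 97])
```

If a name can contain more than two words, this unpacking would misplace fields, so a different delimiter or format would be needed.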

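The same concept applies to reading data from a file. Instead of using .splitlines(), you can iterate over the individual lines of the file with a for loop: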

>>> with open('/tmp/file.txt') as f:
...     for line in f:
...         print(line.split())
...
['Hopper,', 'Grace', '100', '98', '87', '97'] 
['Knuth,', 'Donald', '82', '87', '92', '81'] 
['Goldberg,', 'Adele', '94', '96', '90', '91'] 
['Kernighan,', 'Brian', '89', '74', '89', '77'] 
['Liskov,', 'Barbara', '87', '97', '81', '85'] 

Or, if you want a nested list:

>>> with open('/tmp/file.txt') as f:
...     print([line.split() for line in f])
...
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']] 

If you want just one column of numbers from those lines:

>>> with open('/tmp/file.txt') as f:
...     print([line.split()[2] for line in f])
...
['100', '82', '94', '89', '87'] 

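Opening a file and iterating over its lines with a for loop or a list comprehension is considered an important Python idiom. Use it rather than reading the whole file into memory.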
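Since the question asks for one list per column, here is a hedged sketch that builds on the same line-by-line iteration and then transposes the rows with `zip(*rows)`. The helper name `read_columns` is made up for illustration, and an in-memory `io.StringIO` stands in for the real file so the example runs on its own. It assumes every line has the same number of fields; `zip` silently truncates to the shortest row otherwise.

```python
import io


def read_columns(lines):
    """Split each non-blank line into fields, then regroup the fields by column."""
    rows = [line.split() for line in lines if line.strip()]
    return [list(column) for column in zip(*rows)]


# Stand-in for open('scores.txt'); any iterable of lines works the same way.
sample_file = io.StringIO(
    "Hopper, Grace 100 98 87 97\n"
    "Knuth, Donald 82 87 92 81\n"
    "Goldberg, Adele 94 96 90 91\n"
)

columns = read_columns(sample_file)
last_names, first_names, *exam_columns = columns
exam_one = [int(score) for score in exam_columns[0]]
print(exam_one)  # [100, 82, 94]
```

With a real file you would pass the open file object instead, for example `with open('scores.txt') as f: columns = read_columns(f)`.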