numpy array with read() (Python)

For the text file (testfile.txt):

# blah blah blah 

Unpleasant astonished an diminution up. Noisy an their of meant. Death means up civil do an offer wound of. 
//Called square an in afraid direct. 


{Resolution} diminution conviction so (mr at) unpleasing simplicity no. 
/*No it as breakfast up conveying earnestly

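When storing the contents of a text file in a numpy array, I cannot understand the difference between:

A. the text file being opened directly (without read()) and stored in a numpy array, and

B. the text file being first opened with read() and then stored in a numpy array.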

Here is the code:

import numpy  

# open directly with no read 
a = numpy.array([str(i) for i in open(r'C:\testfile.txt', 'r')]) 

# open with read then store in numpy *how I want to do it* 
f = open(r'C:\testfile.txt', 'r').read() 
b = numpy.array([str(i) for i in f]) 

print("A") 
print(a) 
print() 
print("B") 
print(b)

My question is how to change the numpy.array([str(i) for i in f]) command so that the resulting numpy array stores the contents of the text file the way output A does (see below).

Output:

A 
['# blah blah blah\n' '\n' 
'Unpleasant astonished an diminution up. Noisy an their of meant. Death means up civil do an offer wound of. \n' 
'//Called square an in afraid direct. \n' '\n' '\n' 
'{Resolution} diminution conviction so (mr at) unpleasing simplicity no. \n' 
'/*No it as breakfast up conveying earnestly '] 

B 
['#' ' ' 'b' 'l' 'a' 'h' ' ' 'b' 'l' 'a' 'h' ' ' 'b' 'l' 'a' 'h' '\n' '\n' 
'U' 'n' 'p' 'l' 'e' 'a' 's' 'a' 'n' 't' ' ' 'a' 's' 't' 'o' 'n' 'i' 's' 
'h' 'e' 'd' ' ' 'a' 'n' ' ' 'd' 'i' 'm' 'i' 'n' 'u' 't' 'i' 'o' 'n' ' ' 
'u' 'p' '.' ' ' 'N' 'o' 'i' 's' 'y' ' ' 'a' 'n' ' ' 't' 'h' 'e' 'i' 'r' 
' ' 'o' 'f' ' ' 'm' 'e' 'a' 'n' 't' '.' ' ' 'D' 'e' 'a' 't' 'h' ' ' 'm' 
'e' 'a' 'n' 's' ' ' 'u' 'p' ' ' 'c' 'i' 'v' 'i' 'l' ' ' 'd' 'o' ' ' 'a' 
'n' ' ' 'o' 'f' 'f' 'e' 'r' ' ' 'w' 'o' 'u' 'n' 'd' ' ' 'o' 'f' '.' ' ' 
'\n' '/' '/' 'C' 'a' 'l' 'l' 'e' 'd' ' ' 's' 'q' 'u' 'a' 'r' 'e' ' ' 'a' 
'n' ' ' 'i' 'n' ' ' 'a' 'f' 'r' 'a' 'i' 'd' ' ' 'd' 'i' 'r' 'e' 'c' 't' 
'.' ' ' '\n' '\n' '\n' '{' 'R' 'e' 's' 'o' 'l' 'u' 't' 'i' 'o' 'n' '}' ' ' 
'd' 'i' 'm' 'i' 'n' 'u' 't' 'i' 'o' 'n' ' ' 'c' 'o' 'n' 'v' 'i' 'c' 't' 
'i' 'o' 'n' ' ' 's' 'o' ' ' '(' 'm' 'r' ' ' 'a' 't' ')' ' ' 'u' 'n' 'p' 
'l' 'e' 'a' 's' 'i' 'n' 'g' ' ' 's' 'i' 'm' 'p' 'l' 'i' 'c' 'i' 't' 'y' 
' ' 'n' 'o' '.' ' ' '\n' '/' '*' 'N' 'o' ' ' 'i' 't' ' ' 'a' 's' ' ' 'b' 
'r' 'e' 'a' 'k' 'f' 'a' 's' 't' ' ' 'u' 'p' ' ' 'c' 'o' 'n' 'v' 'e' 'y' 
'i' 'n' 'g' ' ' 'e' 'a' 'r' 'n' 'e' 's' 't' 'l' 'y' ' ']

Source

2016-11-28 Karim

To get output A, just split the result of read() into lines:

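import numpy

def load_entire_file_into_memory_and_then_convert(filename):
    with open(filename, 'r') as input_file:
        full_file_contents = input_file.read()
    # split('\n') drops the newline characters; use
    # full_file_contents.splitlines(keepends=True) to keep them, as in output A.
    lines_of_file = full_file_contents.split('\n')
    return numpy.array(lines_of_file)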

And here is your other version:

def load_file_line_by_line(filename):
    with open(filename, 'r') as input_file:
        # Iterating over the file object yields one string per line.
        lines_of_file = [line for line in input_file]
    return numpy.array(lines_of_file)

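Note the semantic difference between the two versions, and why you get different results: when you do "for ... in" over a file, you get individual lines. If you call read(), you get the entire file as a single string (with the lines separated by newline characters), and "for ... in" over a string gives you the individual characters of that string, not lines. There may be cases where read() is more convenient (for example, when you really do want all lines loaded at once), but it is usually more scalable, and better practice, to process a file line by line by iterating over the file object. Doing so lets you reduce the memory footprint, for example in applications that do not need all lines in memory at the same time and can operate on one line of the file at a time.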
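The difference described above can be seen side by side in a small sketch. The sample text and the temporary file are assumptions made up for the demo (they are not the asker's testfile.txt), and `splitlines(keepends=True)` is offered as one possible way to get lines with their newline characters back from a read() string, not necessarily the only one.

```python
import os
import tempfile

import numpy

# Hypothetical sample file, written to a temporary location for the demo.
sample_text = "# blah blah blah\n\nsecond line \n/*last line "
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write(sample_text)
    path = tmp.name

# Iterating over the file object yields one string per line.
with open(path, "r") as input_file:
    by_line = numpy.array([line for line in input_file])

# Iterating over the string returned by read() yields single characters.
with open(path, "r") as input_file:
    contents = input_file.read()
by_char = numpy.array([ch for ch in contents])

# splitlines(keepends=True) turns the read() string back into lines,
# with the newline characters kept.
from_read = numpy.array(contents.splitlines(keepends=True))

os.remove(path)

print(by_line.shape, by_char.shape)
print(numpy.array_equal(by_line, from_read))
```

One caveat: `splitlines()` also breaks on a few less common separators (form feed, for instance), so for unusual input it may not match file iteration exactly.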

Source

2016-11-28 07:19:56

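For text files, `readlines()` is usually more useful than `read()`. The result is similar to iterating over the open file. – hpaulj

Agreed. `readlines()` is essentially the same as doing "for ... in" on the file object. But the question asked directly how to do it with `read()`. –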
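As a quick check of the point made in these comments, the sketch below compares `readlines()` with a list built by iterating over the file object. The two-line sample file is a made-up assumption for the demo.

```python
import os
import tempfile

import numpy

# Hypothetical two-line sample file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

# readlines() returns a list with one string per line, newlines included.
with open(path, "r") as input_file:
    via_readlines = input_file.readlines()

# Iterating over the file object produces the same strings.
with open(path, "r") as input_file:
    via_iteration = [line for line in input_file]

os.remove(path)

lines = numpy.array(via_readlines)
print(via_readlines == via_iteration)
print(lines)
```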
