Python: problem reading N lines at a time with islice

I am trying to use "from itertools import islice" to read a number of lines at a time from a *.las file with the liblas module. (My goal is to read the file chunk by chunk.)

I am following this question: Python how to read N number of lines at a time

islice() can be used to get the next n items of an iterator. Thus, list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines. At the end of the file, the list might be shorter, and finally the call will return an empty list.
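
As a minimal sketch of the behaviour described in that quote, the loop below uses io.StringIO as a stand-in for an open text file (an assumption made only so the example runs anywhere). Because a file object is its own iterator, every islice() call continues where the previous one stopped.

```python
import io
from itertools import islice

# Stand-in for an open text file with five lines (illustrative data only).
f = io.StringIO("a\nb\nc\nd\ne\n")

sizes = []
while True:
    chunk = list(islice(f, 2))  # next 2 lines, fewer at the end of the file
    if not chunk:               # an empty list means the file is exhausted
        break
    sizes.append(len(chunk))

print(sizes)  # [2, 2, 1]
```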

I use the following code:

from numpy import nonzero
from liblas import file as lasfile
from itertools import islice


chunkSize = 1000000

f = lasfile.File(inFile, None, 'r')  # open LAS
while True:
    chunk = list(islice(f, chunkSize))
    if not chunk:
        break
    # do other stuff

but I have this problem:

len(f)
2866390

chunk = list(islice(f, 1000000))
len(chunk)
1000000
chunk = list(islice(f, 1000000))
len(chunk)
1000000
chunk = list(islice(f, 1000000))
len(chunk)
866390
chunk = list(islice(f, 1000000))
len(chunk)
1000000

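When the file f reaches its end, islice starts reading the file again from the beginning instead of returning an empty list.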

Thanks for any suggestions and help. It is much appreciated.

Source

2012-10-08 Gianni Spear

Argh, so your 'lasfile.File' type breaks all the iteration conventions?

I am having a really bad time with lasfile.File

It seems like it would be easy enough to write a generator that yields n lines at a time:

def n_line_iterator(fobj, n):
    if n < 1:
        raise ValueError("Must supply a positive number of lines to read")

    out = []
    num = 0
    for line in fobj:
        if num == n:
            yield out  # yield one full chunk
            num = 0
            out = []
        out.append(line)
        num += 1
    if out:
        yield out  # yield the remaining lines, if any
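
The same idea can also be sketched with islice. The difference from the loop in the question is that iter() is called exactly once, so every islice() call advances the same iterator instead of asking the object for a new one. The name iter_chunks is hypothetical, and whether this avoids the restart with a given liblas version is an assumption (it requires that iter() on the file returns an iterator that stays exhausted once it has finished); it has not been tested against liblas here.

```python
from itertools import islice


def iter_chunks(iterable, n):
    """Yield lists of up to n items from iterable (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    it = iter(iterable)  # call iter() once and reuse the same iterator
    while True:
        chunk = list(islice(it, n))
        if not chunk:    # the shared iterator is exhausted
            return
        yield chunk


# Usage with a plain range standing in for the point file.
print([len(c) for c in iter_chunks(range(7), 3)])  # [3, 3, 1]
```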

Source

2012-10-08 14:17:07 mgilson

Thanks, but there is a problem at "if num = n:": File "", line 16, if num = n: ^ SyntaxError: invalid syntax

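@Gianni - Sorry, I have been away for a week and apparently forgot how to code. That was a typo. I have updated the answer and fixed that one. Let me know if you find more. – mgilson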

No problem, thanks!!! I am having a really bad time with the liblas module. I cannot read the file in chunks: http://stackoverflow.com/questions/12769353/python-suggestions-to-improve-a-chunk-by-chunk-code-to-read-several-millions-of I have been trying for two days :(

Change the source code of file.py, which belongs to the liblas package. Currently __iter__ is defined as (src on github):

def __iter__(self):
    """Iterator support (read mode only)

      >>> points = []
      >>> for i in f:
      ...     points.append(i)
      ...     print i # doctest: +ELLIPSIS
      <liblas.point.Point object at ...>
    """
    if self.mode == 0:
        self.at_end = False
        p = core.las.LASReader_GetNextPoint(self.handle)
        while p and not self.at_end:
            yield point.Point(handle=p, copy=True)
            p = core.las.LASReader_GetNextPoint(self.handle)
            if not p:
                self.at_end = True
        else:
            self.close()
            self.open()

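As you can see, when the end of the file is reached the file is closed and opened again, so the next iteration starts again at the beginning of the file.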

Try removing the final else block after the while loop, so the correct code for the method should be:

def __iter__(self):
    """Iterator support (read mode only)

      >>> points = []
      >>> for i in f:
      ...     points.append(i)
      ...     print i # doctest: +ELLIPSIS
      <liblas.point.Point object at ...>
    """
    if self.mode == 0:
        self.at_end = False
        p = core.las.LASReader_GetNextPoint(self.handle)
        while p and not self.at_end:
            yield point.Point(handle=p, copy=True)
            p = core.las.LASReader_GetNextPoint(self.handle)
            if not p:
                self.at_end = True
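
The effect of that else block can be reproduced without liblas. The sketch below is only an illustration: FakeReader, its rewind_at_end flag and chunk_sizes are hypothetical names, and the class merely imitates a reader that keeps one shared read position and whose __iter__ returns a new generator on every call. With the rewind enabled, the chunk sizes follow the pattern reported in the question; with it disabled, the fourth call returns an empty list.

```python
from itertools import islice


class FakeReader:
    """Hypothetical stand-in for a point reader; not part of liblas."""

    def __init__(self, items, rewind_at_end):
        self._items = list(items)
        self._pos = 0  # shared read position, like a native file handle
        self._rewind_at_end = rewind_at_end

    def _next_item(self):
        # Return the next item, or None once the data is exhausted.
        if self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item

    def __iter__(self):
        item = self._next_item()
        while item is not None:
            yield item
            item = self._next_item()
        # Reached only when the data ran out, not when the caller stops early.
        if self._rewind_at_end:
            self._pos = 0  # imitates close() followed by open()


def chunk_sizes(reader, n, calls):
    # Each islice() call invokes iter(reader) and gets a new generator.
    return [len(list(islice(reader, n))) for _ in range(calls)]


print(chunk_sizes(FakeReader(range(7), rewind_at_end=True), 3, 4))   # [3, 3, 1, 3]
print(chunk_sizes(FakeReader(range(7), rewind_at_end=False), 3, 4))  # [3, 3, 1, 0]
```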

Source

2012-10-08 14:31:24 halex

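Thanks halex, I had a really bad time with liblas yesterday and today. I am trying to read the file chunk by chunk, but the lasfile.File type breaks all the iterator conventions. I followed all the examples I found on Google, but there is always a new problem. See: http://stackoverflow.com/questions/12769353/python-suggestions-to-improve-a-chunk-by-chunk-code-to-read-several-millions-of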

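This is a very interesting design. However, you should be able to exhaust the iterator. I still don't understand why it loops around instead of requiring a new loop to restart it... – mgilson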

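They created the liblas module to read *.las files in Python. A *.las file is a special format for storing "laser data" known as LiDAR (http://en.wikipedia.org/wiki/LIDAR). The las file format is the ASPRS LiDAR data exchange format, where ASPRS is the American Society for Photogrammetry and Remote Sensing.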
