2016-10-13 20 views

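Reading and closing a large number of files in a new class causes OSError: Too many open files

I have a large number of files that I need to iterate over, searching each one for certain strings. When a string is found, the file is copied to a new folder; otherwise the file is simply closed.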

Here is the sample code:

import os
import stringsfilter


def apply_filter(path, filter_dict):
    dirlist = os.listdir(path)
    for directory in dirlist:
        pwd = path + '/' + directory
        filelist = os.listdir(pwd)
        for filename in filelist:
            if filename.split('.')[-1] == "stats":
                sfilter = stringsfilter.StringsFilter(pwd, filename, filter_dict["strings"])
                sfilter.find_strings_and_move()

Here is stringsfilter.py:

import main
import codecs
import os
import shutil


class StringsFilter:

    strings = None

    def __init__(self, filepath, filename, strings):
        self.filepath = filepath
        self.filename = filename
        self.strings = strings
        self.logger = main.get_module_logger("StringsFilter")
        self.file_desc = codecs.open(self.filepath + '/' + self.filename, 'r', encoding="utf-8-sig")
        self.logger.debug("[-] Strings: " + str(self.strings))
        self.logger.debug("[-] Instantiating class Strings Filter, filename: %s " % self.filename)

    def find_strings_and_move(self):
        for line in self.file_desc.readlines():
            for string in self.strings:
                if string in line:
                    self.move_to_folder()
                    return
        self.close()

    def move_to_folder(self):
        name = self.filename.split('.')[0]
        os.mkdir(self.filepath + '/' + name)
        shutil.copyfile(self.filepath + '/' + self.filename,
                        self.filepath + '/' + name + '/' + self.filename)
        self.close()

    def close(self):
        if self.file_desc:
            self.logger.debug("[-] Closing file %s" % self.filename)
            self.file_desc.close()

main.py:

import logging 

def get_module_logger(name): 
    # create logger 
    logger = logging.getLogger(name) 

    # set logging level to log everything 
    logger.setLevel(logging.DEBUG) 

    # create file handler which logs everything 
    fh = logging.FileHandler('files.log') 
    fh.setLevel(logging.DEBUG) 

    # create console handler 
    ch = logging.StreamHandler() 
    ch.setLevel(logging.INFO) 

    # create formatter and add it to the handlers
    formatter = logging.Formatter('[%(asctime)s] [%(name)-17s] [%(levelname)-5s] - %(message)s') 
    fh.setFormatter(formatter) 
    ch.setFormatter(formatter) 

    # add the handlers to the logger 
    logger.addHandler(fh) 
    logger.addHandler(ch) 
    return logger 

In the log I can see the following:

[2016-10-13 10:07:07,002] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,002] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file1.stats 
[2016-10-13 10:07:07,003] [StringsFilter ] [DEBUG] - [-] Closing file file1.stats 
[2016-10-13 10:07:07,003] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,003] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,004] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file2.stats 
[2016-10-13 10:07:07,004] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file2.stats 
[2016-10-13 10:07:07,004] [StringsFilter ] [DEBUG] - [-] Closing file file2.stats 
[2016-10-13 10:07:07,004] [StringsFilter ] [DEBUG] - [-] Closing file file2.stats 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Strings: ['DEVICE_PROBLEM'] 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file3.stats 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file3.stats 
[2016-10-13 10:07:07,005] [StringsFilter ] [DEBUG] - [-] Instantiating class Strings Filter, filename: file3.stats 
[2016-10-13 10:07:07,006] [StringsFilter ] [DEBUG] - [-] Closing file file3.stats 
[2016-10-13 10:07:07,006] [StringsFilter ] [DEBUG] - [-] Closing file file3.stats 
[2016-10-13 10:07:07,006] [StringsFilter ] [DEBUG] - [-] Closing file file3.stats 

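And so on, as if on every iteration each statement from the initialization is executed one more time, until too many files are open and the program ends with: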

OSError: [Errno 24] Too many files open 

I cannot understand why the statements in __init__ are executed multiple times each time an instance is created.

I get the error "'main' is not defined" at line 9 of stringsfilter.py. –

Sorry, I forgot main.py. It has been added now. – Libor

Answer

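The reason the same message is logged multiple times: every time main.get_module_logger("StringsFilter") is called, you call logger.addHandler(...) on the same logger that logging.getLogger(name) returns, so a single logger ends up with multiple handlers. It is better to create the logger once at module level: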

import ... 
LOG = main.get_module_logger("StringsFilter") 
class StringsFilter:... 
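
If get_module_logger() may still be called more than once for the same name, it could also guard against attaching handlers twice. The following is only a sketch of that idea, not the original main.py: the log_path parameter is a hypothetical addition, and the check on logger.handlers assumes nothing else attaches handlers to this logger. It may also bear on the open-file problem, since each logging.FileHandler opens its log file when it is created, so every extra handler holds one more open file.

```python
import logging


def get_module_logger(name, log_path="files.log"):
    """Return the logger for `name`, attaching handlers only on the first call.

    `log_path` is a hypothetical parameter added for this sketch.
    """
    # logging.getLogger(name) returns the same object for the same name.
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)

    # Handlers from an earlier call are still attached, so do not add more.
    if logger.handlers:
        return logger

    formatter = logging.Formatter(
        '[%(asctime)s] [%(name)-17s] [%(levelname)-5s] - %(message)s')

    # FileHandler opens log_path as soon as it is constructed.
    fh = logging.FileHandler(log_path)
    fh.setLevel(logging.DEBUG)
    fh.setFormatter(formatter)

    ch = logging.StreamHandler()
    ch.setLevel(logging.INFO)
    ch.setFormatter(formatter)

    logger.addHandler(fh)
    logger.addHandler(ch)
    return logger
```

With this guard, calling get_module_logger("StringsFilter") from every __init__ would still return a logger with exactly two handlers, although creating it once at module level remains the simpler option.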

As for the open files, I cannot see the cause, but consider using the with open(filename) as f: syntax in find_strings_and_move().

import os
import shutil

import main

# Create the logger once, at import time, so handlers are attached only once.
LOG = main.get_module_logger("StringsFilter")


class StringsFilter:

    strings = None

    def __init__(self, filepath, filename, strings):
        self.filepath = filepath
        self.filename = filename
        self.strings = strings
        LOG.debug("[-] Strings: " + str(self.strings))
        LOG.debug("[-] Instantiating class Strings Filter, filename: %s " % self.filename)

    def find_strings_and_move(self):
        # The with block closes the file as soon as the lines have been read.
        with open(self.filepath + '/' + self.filename, 'r') as file_desc:
            lines = file_desc.readlines()
        for line in lines:
            for string in self.strings:
                if string in line:
                    self.move_to_folder()
                    return

    def move_to_folder(self):
        name = self.filename.split('.')[0]
        os.mkdir(self.filepath + '/' + name)
        shutil.copyfile(self.filepath + '/' + self.filename,
                        self.filepath + '/' + name + '/' + self.filename)

This way you make sure the file is closed 1) always and 2) before the move.
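
The same idea can be written without a class. The sketch below assumes Python 3; the function names contains_any and copy_if_match are made up for illustration, and it uses os.makedirs(..., exist_ok=True) where the code above uses os.mkdir, so an existing target folder does not raise. It reads the file line by line instead of calling readlines(), and the with block still closes the file on the early return.

```python
import os
import shutil


def contains_any(path, strings, encoding="utf-8-sig"):
    """Return True if any of `strings` occurs in some line of the file."""
    with open(path, "r", encoding=encoding) as handle:
        # Iterating over the handle reads one line at a time; the file is
        # closed when the with block exits, including on the early return.
        for line in handle:
            if any(s in line for s in strings):
                return True
    return False


def copy_if_match(directory, filename, strings):
    """Copy the file into a subfolder named after it when a string matches.

    Returns the path of the copy, or None when nothing matched.
    """
    source = os.path.join(directory, filename)
    if not contains_any(source, strings):
        return None
    # Same folder naming as move_to_folder(): the part before the first dot.
    target_dir = os.path.join(directory, filename.split('.')[0])
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, filename)
    shutil.copyfile(source, target)
    return target
```

In apply_filter() the call would then look like copy_if_match(pwd, filename, filter_dict["strings"]), with no object left holding an open file between iterations.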

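Thank you. Your solution works fine. I had not realized that I was duplicating the logging in the StringsFilter class. Also, I was probably not closing the files correctly, because now that I use the with statement the program no longer ends with the error: too many open files. – Libor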

@Libor I'm glad it worked :) Please consider accepting/upvoting the answer. – MateuszL
