2017-06-24 194 views

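How to hide the console window when I run tesseract with pytesseract using CREATE_NO_WINDOW

I am using tesseract to perform OCR on screengrabs. I have an app with a tkinter window that uses self.after, set up in my class's initialization, to constantly scrape images and update labels and other values in the tkinter window. I have searched for several days and can't find any specific example of how to use CREATE_NO_WINDOW with Python 3.6 on Windows when calling tesseract through pytesseract.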
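For context, the polling setup described above can be sketched roughly as follows. This is only an illustration and not the actual application: the class name `OcrPoller`, the `read_value` callable, and the default one-second interval are assumptions made for the example. Only the pattern of rescheduling work with `after` comes from the description.

```python
class OcrPoller:
    """Hypothetical helper: re-runs a reader every interval_ms and shows the result.

    `root` is expected to be a tkinter widget (it provides .after) and
    `label` a tkinter Label (it provides .config); `read_value` is any
    callable returning the text to display, e.g. a function that runs OCR.
    """

    def __init__(self, root, label, read_value, interval_ms=1000):
        self.root = root
        self.label = label
        self.read_value = read_value
        self.interval_ms = interval_ms
        # Schedule the first poll. after() returns immediately, so the
        # Tk event loop stays free to process other events.
        self.root.after(self.interval_ms, self.poll)

    def poll(self):
        # Refresh the label, then reschedule the next poll.
        self.label.config(text=self.read_value())
        self.root.after(self.interval_ms, self.poll)
```

With a real window this might be wired up as `root = tk.Tk()`, `label = tk.Label(root)`, `OcrPoller(root, label, read_value=some_ocr_function)`, followed by `root.mainloop()`. Because `read_value` runs on every tick, any console window opened by the OCR subprocess would flash on every tick as well.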

This is related to the following question:

How can I hide the console window when I run tesseract with pytesser

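I have only been programming in Python for 2 weeks and don't understand how to carry out the steps in the question above. I opened the pytesseract.py file, looked through it, and found the line proc = subprocess.Popen(command, stderr=subprocess.PIPE), but when I tried editing it I got a bunch of errors that I couldn't figure out.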

#!/usr/bin/env python 

''' 
Python-tesseract. For more information: https://github.com/madmaze/pytesseract 

''' 

try: 
    import Image 
except ImportError: 
    from PIL import Image 

import os 
import sys 
import subprocess 
import tempfile 
import shlex 


# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY 
tesseract_cmd = 'tesseract' 

__all__ = ['image_to_string'] 


def run_tesseract(input_filename, output_filename_base, lang=None, boxes=False, 
        config=None): 
    ''' 
    runs the command: 
     `tesseract_cmd` `input_filename` `output_filename_base` 

    returns the exit status of tesseract, as well as tesseract's stderr output 

    ''' 
    command = [tesseract_cmd, input_filename, output_filename_base] 

    if lang is not None:
        command += ['-l', lang]

    if boxes:
        command += ['batch.nochop', 'makebox']

    if config:
        command += shlex.split(config)

    proc = subprocess.Popen(command, stderr=subprocess.PIPE) 
    status = proc.wait() 
    error_string = proc.stderr.read() 
    proc.stderr.close() 
    return status, error_string 


def cleanup(filename): 
    ''' tries to remove the given filename. Ignores non-existent files ''' 
    try:
        os.remove(filename)
    except OSError:
        pass


def get_errors(error_string): 
    ''' 
    returns all lines in the error_string that start with the string "error" 

    ''' 

    error_string = error_string.decode('utf-8') 
    lines = error_string.splitlines() 
    error_lines = tuple(line for line in lines if line.find(u'Error') >= 0) 
    if len(error_lines) > 0:
        return u'\n'.join(error_lines)
    else:
        return error_string.strip()


def tempnam(): 
    ''' returns a temporary file-name ''' 
    tmpfile = tempfile.NamedTemporaryFile(prefix="tess_") 
    return tmpfile.name 


class TesseractError(Exception): 
    def __init__(self, status, message): 
        self.status = status
        self.message = message
        self.args = (status, message)


def image_to_string(image, lang=None, boxes=False, config=None): 
    ''' 
    Runs tesseract on the specified image. First, the image is written to disk, 
    and then the tesseract command is run on the image. Tesseract's result is 
    read, and the temporary files are erased. 

    Also supports boxes and config: 

    if boxes=True 
     "batch.nochop makebox" gets added to the tesseract call 

    if config is set, the config gets appended to the command. 
     ex: config="-psm 6" 
    ''' 

    if len(image.split()) == 4:
        # In case we have 4 channels, lets discard the Alpha.
        # Kind of a hack, should fix in the future some time.
        r, g, b, a = image.split()
        image = Image.merge("RGB", (r, g, b))

    input_file_name = '%s.bmp' % tempnam()
    output_file_name_base = tempnam()
    if not boxes:
        output_file_name = '%s.txt' % output_file_name_base
    else:
        output_file_name = '%s.box' % output_file_name_base
    try:
        image.save(input_file_name)
        status, error_string = run_tesseract(input_file_name,
                                             output_file_name_base,
                                             lang=lang,
                                             boxes=boxes,
                                             config=config)
        if status:
            errors = get_errors(error_string)
            raise TesseractError(status, errors)
        f = open(output_file_name, 'rb')
        try:
            return f.read().decode('utf-8').strip()
        finally:
            f.close()
    finally:
        cleanup(input_file_name)
        cleanup(output_file_name)


def main(): 
    if len(sys.argv) == 2:
        filename = sys.argv[1]
        try:
            image = Image.open(filename)
            if len(image.split()) == 4:
                # In case we have 4 channels, lets discard the Alpha.
                # Kind of a hack, should fix in the future some time.
                r, g, b, a = image.split()
                image = Image.merge("RGB", (r, g, b))
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image))
    elif len(sys.argv) == 4 and sys.argv[1] == '-l':
        lang = sys.argv[2]
        filename = sys.argv[3]
        try:
            image = Image.open(filename)
        except IOError:
            sys.stderr.write('ERROR: Could not open file "%s"\n' % filename)
            exit(1)
        print(image_to_string(image, lang=lang))
    else:
        sys.stderr.write('Usage: python pytesseract.py [-l lang] input_file\n')
        exit(2)


if __name__ == '__main__': 
    main() 

The code I am using is based on the example in a similar question:

import cv2
import numpy as np
import pytesseract
from PIL import Image

# src_path (the folder prefix for the image files) is defined elsewhere
# in the application.


def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)
    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)
    # Write image after removed noise
    cv2.imwrite(src_path + "removed_noise.png", img)
    # Apply threshold to get image with only black and white
    # Write the image after apply opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)
    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    return result

When execution reaches the following line, a black console window flashes for less than a second and then closes once the command has run.

result = pytesseract.image_to_string(Image.open(src_path + "thres.png")) 

Here is a picture of the console window:

[Image: Program Files (x86)_Tesseract]

Here is what was suggested in the other question:

You're currently working in IDLE, in which case I don't think it really matters if a console window pops up. If you're planning to develop a GUI app with this library, then you'll need to modify the subprocess.Popen call in pytesser.py to hide the console. I'd first try the CREATE_NO_WINDOW process creation flag. – eryksun

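I would greatly appreciate any help with modifying the subprocess.Popen call in the pytesseract.py library file to use CREATE_NO_WINDOW. I am also not sure what the difference is between the pytesseract.py and pytesser.py library files. I would leave a comment on the other question asking for clarification, but I can't until I have more reputation on this site.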

Answer


I did more research and decided to learn more about subprocess.Popen:

Documentation for subprocess

I also referred to the following post:

using python subprocess.popen..can't prevent exe stopped working prompt

I changed the original line of code in pytesseract.py from:

proc = subprocess.Popen(command, stderr=subprocess.PIPE) 

to the following:

proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW) 

I ran the code and got the following error:

Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\tkinter\__init__.py", line 1699, in __call__
    return self.func(*args)
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 403, in gather_data
    update_cash_button()
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 208, in update_cash_button
    currentCash = get_string(src_path + "cash.png")
  File "C:\Users\Steve\Documents\Stocks\QuickOrder\QuickOrderGUI.py", line 150, in get_string
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 125, in image_to_string
    config=config)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pytesseract\pytesseract.py", line 49, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE, creationflags = CREATE_NO_WINDOW)
NameError: name 'CREATE_NO_WINDOW' is not defined

The name CREATE_NO_WINDOW did not exist in pytesseract.py, so I then defined it there as a variable:

#Assignment of the value of CREATE_NO_WINDOW 
CREATE_NO_WINDOW = 0x08000000 

I got the value 0x08000000 from the post linked above. After adding the definition, I ran the application and the console window no longer popped up.
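The same idea can be sketched as a small standalone helper that only passes the flag on Windows. This is a minimal sketch, not the library's own code: the names `no_console_kwargs` and `run_quietly` are hypothetical and exist only for this example. It also relies on the fact that Python 3.7 and later expose the constant as `subprocess.CREATE_NO_WINDOW` on Windows, falling back to the literal value 0x08000000 otherwise.

```python
import subprocess
import sys

# Win32 CREATE_NO_WINDOW process creation flag. Python 3.7+ defines
# subprocess.CREATE_NO_WINDOW on Windows; older versions (such as 3.6)
# and other platforms fall back to the literal value.
CREATE_NO_WINDOW = getattr(subprocess, "CREATE_NO_WINDOW", 0x08000000)


def no_console_kwargs(platform=sys.platform):
    """Return extra Popen keyword arguments that suppress the console window.

    creationflags is only valid on Windows, so other platforms get
    an empty dict.
    """
    if platform.startswith("win"):
        return {"creationflags": CREATE_NO_WINDOW}
    return {}


def run_quietly(command):
    """Run command and return (exit status, stderr bytes) without a console window."""
    proc = subprocess.Popen(command, stderr=subprocess.PIPE,
                            **no_console_kwargs())
    # communicate() drains stderr while waiting, which avoids a deadlock
    # if the child writes more than the pipe buffer can hold.
    _, error_bytes = proc.communicate()
    return proc.returncode, error_bytes
```

Guarding on the platform keeps the same call usable on Linux or macOS, where passing a non-zero `creationflags` raises an error. Note that any edit made directly inside site-packages is lost when the package is upgraded or reinstalled.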
