pytesseract and image.tif files

I need to use pytesseract to transcribe an image.tif with several pages into text. I have the following code:

> from PIL import Image
> import pytesseract
>
> # Path to the Tesseract executable
> pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
> # Image.open positions the image on its first page only
> print(pytesseract.image_to_string(Image.open('CAMARA.tif'), lang="spa"))

The problem is that only the first page is extracted. How can I extract all of them?

Source

2017-07-25 Andrés

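I think you are only referring to a single image, "camara.tif". First you have to convert all the PDF pages into images; you can see this link for how to do that.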

Next, use pytesseract to loop over the images one by one and extract the text from each image.
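If the file is a multi-page TIFF, as the question suggests, the loop can probably run over the pages of that one file without any conversion step. The sketch below is only an illustration under that assumption: it uses Pillow's `ImageSequence.Iterator` to walk the frames, and the helper name `ocr_all_pages` and its `ocr` parameter are made up for this example, not part of pytesseract.

```python
from PIL import Image, ImageSequence


def ocr_all_pages(tiff_path, ocr, lang="spa"):
    """Return a list holding the OCR text of every page (frame) in a TIFF.

    `ocr` is any callable accepting (image, lang=...), for example
    pytesseract.image_to_string. This helper is hypothetical.
    """
    texts = []
    with Image.open(tiff_path) as img:
        # ImageSequence.Iterator seeks through the frames one at a time.
        for page in ImageSequence.Iterator(img):
            # copy() detaches the current frame before the next seek.
            texts.append(ocr(page.copy(), lang=lang))
    return texts
```

Calling `ocr_all_pages('CAMARA.tif', pytesseract.image_to_string)` after setting `tesseract_cmd` as in the question should give one string per page, which can then be joined with `"\n".join(...)`. Passing the OCR function as an argument is just a design choice that keeps the page loop independent of the Tesseract installation.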

Source

2017-09-12 04:38:18
