Tesseract-OCR

Submitted by pigbar on Wed, 10/23/2013 - 16:01

This is a simple connector for the well-known Tesseract-OCR engine. It takes an uncompressed TIF image file as input and produces the text recognized in that image.

The connector works with the following parameters:

The PATH for the tesseract-ocr engine, example: /usr/bin/tesseract (or c:\tools\tesseract.exe)

The NAME of the variable for the attached TIF image file, example: ${myAttch.getName()}

The NAME for the output file, example: myOcrFile

The LANGUAGE for the TIF image file, example: en (English), es (Spanish), etc.
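To make the role of each parameter concrete, here is a minimal sketch of how the four values could map onto a call to the tesseract command line (image, output base name, then -l for the language). It is not the connector's actual code: the function names build_tesseract_command and run_ocr are hypothetical, and the language value is passed through unchanged. Stock tesseract installations usually name their language data with three-letter codes such as eng or spa; whether the connector translates en or es into those is not stated here.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(engine_path, image_path, output_name, language):
    """Build the argument list: <engine> <image> <output base> -l <language>.

    Hypothetical helper for illustration; not part of the connector.
    """
    return [engine_path, str(image_path), output_name, "-l", language]


def run_ocr(engine_path, image_path, output_name, language):
    """Run tesseract and return the recognized text.

    tesseract appends ".txt" to the output base name it is given.
    """
    command = build_tesseract_command(engine_path, image_path, output_name, language)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(output_name + ".txt").read_text(encoding="utf-8")
```

Under these assumptions, run_ocr("/usr/bin/tesseract", "scan.tif", "myOcrFile", "eng") would write myOcrFile.txt in the working directory and return its contents; scan.tif is a placeholder file name.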

For now, the connector only supports uncompressed TIF image files containing single-column text. These limitations come from the tesseract-ocr version.
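Because of the uncompressed-only restriction, it can help to check a file before handing it to the connector. The sketch below is an illustration only, not something the connector provides: it reads the Compression tag (259) from the first image directory of a TIFF, where a value of 1 means uncompressed and a missing tag also defaults to 1. The function name tiff_compression is hypothetical, and the code assumes a classic (non-BigTIFF) file with the tag stored as a SHORT.

```python
import struct


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value of the first image in TIFF bytes."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        (tag,) = struct.unpack_from(endian + "H", data, entry)
        if tag == 259:
            # A single SHORT value sits in the first two bytes of the value field.
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
            return value
    return 1  # tag absent: the TIFF default is no compression
```

A caller could read the file with open(path, "rb").read() and reject it when the result is anything other than 1.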

Category:

Other

Licence:

GPL v2

Repository URL:

http://community.bonitasoft.com/project/tesseract-ocr
