Tesseract-OCR

This is a simple connector for the well-known Tesseract-OCR engine. It takes an uncompressed TIF image file as input and produces the text recognized in that image.

The connector takes the following parameters:
The PATH of the tesseract-ocr engine, for example: /usr/bin/tesseract (or c:\tools\tesseract.exe)
The NAME of the variable for the attached TIF image file, for example: ${myAttch.getName()}
The NAME of the output file, for example: myOcrFile
The LANGUAGE of the text in the TIF image file, for example: en (English), es (Spanish), etc.
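
To illustrate how these four parameters fit together, here is a minimal Python sketch of the kind of command-line call such a connector could make. It is not the connector's actual code. It assumes the standard tesseract invocation "tesseract <image> <outputbase> -l <lang>", where tesseract appends ".txt" to the output name. The function names build_tesseract_command and run_ocr are hypothetical, and the language value is passed through unchanged, so it must match a language data name installed for tesseract (commonly a code such as eng); how the connector translates values like "en" is not stated here.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(tesseract_path, image_path, output_base, language):
    """Return the argument list for: tesseract <image> <output_base> -l <language>.

    All four arguments correspond to the connector parameters described above;
    the function name itself is hypothetical.
    """
    return [tesseract_path, image_path, output_base, "-l", language]


def run_ocr(tesseract_path, image_path, output_base, language):
    """Run tesseract on one image and return the recognized text (hypothetical helper)."""
    command = build_tesseract_command(tesseract_path, image_path, output_base, language)
    # check=True raises CalledProcessError if tesseract exits with a non-zero status.
    subprocess.run(command, check=True)
    # tesseract writes its result to "<output_base>.txt".
    return Path(output_base + ".txt").read_text(encoding="utf-8")
```

With tesseract installed, a call such as run_ocr("/usr/bin/tesseract", "scan.tif", "myOcrFile", "eng") would write myOcrFile.txt and return its contents; scan.tif is a placeholder file name.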

For now, the connector only supports uncompressed TIF image files containing single-column text. These limitations come from the tesseract-ocr version in use.

Licence: GPL v2