OCR for scanned documents

Hi Frank,

I'm currently using the Tesseract API from xHarbour to process thousands of scanned .tif documents. Results are about 80% accurate, and I need better than that. For Tesseract to be more accurate on the type of documents I'm OCRing, I would need to change the page segmentation mode (PSM) to 3, which is the default from the command line. Changing PSM to 3 from the API causes the OCR engine to break with a runtime error. It might work a few times, but after a number of runs it breaks and my Harbour program stops working.

Just FYI: these documents contain a unique identifier that matches a customer's account number in the database. This way the documents are automatically indexed and saved into the customer's file without human intervention. Thousands of documents are fed into a commercial scanner each day, and they end up stored in a blob field, with the customer's account number in another indexed char field. 80% accuracy means that 20% of the account numbers weren't read, so a human has to open those documents and attach them to the correct customer.

If you are interested in how to use the Tesseract API from (x)Harbour, I will gladly provide source samples for you to try. I'd love to solve the problem of not being able to change PSM to 3 for more accuracy with my documents. Maybe you can help.

Reinaldo.
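To make the workflow above concrete, here is a rough sketch in Python (not Harbour) of one possible way to get PSM 3 without going through the API: run the `tesseract` command-line program as a separate process with `--psm 3`, then look for the account number in the text it prints. This is only an illustration under stated assumptions. The 8-digit account format, the function names, and the `known_accounts` set are all hypothetical, and the `--psm` spelling assumes a reasonably recent Tesseract command-line build.

```python
import re
import subprocess
from typing import List, Optional, Set

# Hypothetical format: account numbers are assumed to be exactly 8 digits.
ACCOUNT_RE = re.compile(r"\b\d{8}\b")


def build_tesseract_command(tif_path: str, psm: int = 3) -> List[str]:
    """Build a command line that prints the OCR text to stdout."""
    return ["tesseract", tif_path, "stdout", "--psm", str(psm)]


def ocr_text(tif_path: str, psm: int = 3) -> str:
    """Run the tesseract CLI in its own process and return the recognized text.

    Because the OCR engine runs out of process, a failure surfaces as a
    CalledProcessError here instead of taking down the calling program.
    """
    result = subprocess.run(
        build_tesseract_command(tif_path, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def extract_account_number(text: str, known_accounts: Set[str]) -> Optional[str]:
    """Return the first 8-digit candidate that is a known account, else None."""
    for candidate in ACCOUNT_RE.findall(text):
        if candidate in known_accounts:
            return candidate
    return None
```

A document for which `extract_account_number` returns `None` would go to the manual queue; everything else can be stored against the matched account. Checking candidates against the known account numbers, rather than accepting any 8-digit run, avoids filing a document under a misread number.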
