OCR and specific regions
Posted: 2010-05-23T17:43:59-07:00
by galv
I'm using Linux and Mac OS X and I want to use a free OCR program to identify a specific word in a screenshot. I want to apply an IM filter to that specific area. I know about "-draw", but how would I get the exact dimensions of the rectangle from the OCR program?
I've heard about gocr, ocrad and tesseract but have never used them. Does anyone have a solution or ideas?
Re: OCR and specific regions
Posted: 2010-05-23T19:35:28-07:00
by fmw42
You should probably post to an OCR list for information about OCR box sizes.
Re: OCR and specific regions
Posted: 2010-05-23T20:02:11-07:00
by galv
Which free OCR software do you guys recommend?
Re: OCR and specific regions
Posted: 2010-05-23T22:26:43-07:00
by fmw42
I am on a Mac and have used ReadIrisPro, but I don't recall that it was free.
Re: OCR and specific regions
Posted: 2010-05-24T00:24:04-07:00
by anthony
I experimented with using Gocr on some screenshots and found it did not work very well, even though the captured text was perfect and free of noise.
The problem, I figured out, was that the OCR was optimized for scanned documents at 300 to 600 dpi rather than perfect screen captures at 90 to 120 dpi. When I scaled or resized the image to a higher resolution, I had more success.
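As a rough sketch of that upscaling step: the helper below works out a resize percentage that lifts a screen capture to about 300 dpi and builds an ImageMagick command line for it. The function names, the file names and the 100 dpi default are assumptions made for illustration, and "convert" is the ImageMagick 6 command name (newer installs may use "magick").

```python
import math


def upscale_percent(screen_dpi, target_dpi=300.0):
    """Resize percentage that brings screen_dpi up to at least target_dpi."""
    if screen_dpi <= 0:
        raise ValueError("screen_dpi must be positive")
    # Never shrink: a capture already above the target stays at 100%.
    return max(100, math.ceil(100.0 * target_dpi / screen_dpi))


def resize_command(src, dst, screen_dpi=100.0, target_dpi=300.0):
    """Argument list for ImageMagick's convert that enlarges src into dst."""
    percent = upscale_percent(screen_dpi, target_dpi)
    return ["convert", src, "-resize", "%d%%" % percent, dst]


if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.run() to execute it.
    print(" ".join(resize_command("shot.png", "shot_big.png", screen_dpi=96)))
```

Any box coordinates the OCR reports on the enlarged image then have to be divided by the same factor before they are used on the original screenshot.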
I really miss the old days on my Commodore 64 and Amiga, which had software that could look at boxed text on the screen and tell you exactly what the text was for copy and paste. But then, that software knew exactly what font was being used and could match up the symbols perfectly.
Perhaps if you know the font being used, you could DIY a solution by doing a morphology matching operation on the boxed text. That is, segment the box into letters to find the 'grid' being used, and then match up the letter in each box. A screen-resolution text capture would work well in this form.
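A toy sketch of that grid idea, under heavy assumptions: the font is a made-up fixed-width 3x5 bitmap font with only three letters, the text is already cropped to a whole number of cells, and the matching is a plain pixel-by-pixel comparison (Hamming distance) rather than an ImageMagick morphology operation. All names here are hypothetical.

```python
# Hypothetical 3x5 bitmap font: '#' is ink, '.' is background.
FONT = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}
CELL_W = 3


def split_cells(rows, cell_w=CELL_W):
    """Cut the boxed text into fixed-width cells, one per character."""
    width = len(rows[0])
    if any(len(r) != width for r in rows) or width % cell_w:
        raise ValueError("rows must be equal length and a multiple of cell_w")
    return [tuple(r[x:x + cell_w] for r in rows) for x in range(0, width, cell_w)]


def hamming(a, b):
    """Number of pixels that differ between two cells of equal size."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def read_text(rows, font=FONT):
    """Label each cell with the font glyph that differs in the fewest pixels."""
    return "".join(
        min(font, key=lambda ch: hamming(cell, font[ch]))
        for cell in split_cells(rows)
    )


if __name__ == "__main__":
    # Render "LIT" with the toy font, then read it back.
    rows = ["".join(FONT[c][y] for c in "LIT") for y in range(5)]
    print(read_text(rows))
```

With a real screenshot the cells would come from thresholded pixels and the templates from rendering the known font at the same size, but the segment-then-match structure stays the same.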
And please let us know what you find out. People are interested, but few report back their findings.
Re: OCR and specific regions
Posted: 2010-05-30T11:30:06-07:00
by Wolfgang Woehl
tesseract is quite ok. Here's what it outputs from a screenshot of this thread:
$ tesseract
tesseract:Error:Usage:tesseract imagename outputbase [-l lang] [configfile [[+|-]varfile]...]
$ tesseract english.tif text -l eng
$ cat text.txt
---
OCR and specific regions
POSTFIEPLY Ié io Search this topic". Search
OCR and specific regions l?°°”°'E I
Iby gaw » 2010-05-Z4T01:43:59+00:00
l'm using Linux and Mac OS X and I want to use a free OCR to identify a specific word from a screenshot. I want to apply an IM filter at
that specific area. I know about "-draw" but how would I get the exact dimensions of the rectangle from the OCR program?
I've heard about gocr, ocrad, tesseract but never used them. Anyone has a solution or ideas?
Re: OCR and specific regions “¤¤¤TE I
I by rmwaz » 2010-05-Z4T03:35:28+00:0O
probably should post to an OCR list for information about OCR box size
Re: OCR and specific regions .
Iby gaw » 2010-05-Z4T04:0Z:11+00:00
Which free OCR software do you guys recommend?
---
No layout analysis etc.
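On the original question of getting the rectangle for a word: a possible sketch, assuming a newer tesseract than the one shown above, one that can write a TSV file with per-word boxes (for example "tesseract english.tif out tsv"). The column names below (level, left, top, width, height, text) follow that TSV output; the function names, the blur filter and the file names are assumptions for illustration, and the version used in the transcript above may not support this output at all.

```python
import csv
import io


def find_word_boxes(tsv_text, word):
    """Return (left, top, width, height) for each occurrence of word."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    boxes = []
    for row in reader:
        # Level 5 rows describe single words; lower levels are page/block/line.
        if row.get("level") != "5":
            continue
        text = (row.get("text") or "").strip()
        if text.lower() == word.lower():
            boxes.append(tuple(int(row[k])
                               for k in ("left", "top", "width", "height")))
    return boxes


def region_geometry(box):
    """ImageMagick geometry string WxH+X+Y for a (left, top, width, height) box."""
    left, top, width, height = box
    return "%dx%d+%d+%d" % (width, height, left, top)


def blur_command(src, dst, box, sigma=4):
    """Argument list that blurs only the given box using IM's -region."""
    return ["convert", src, "-region", region_geometry(box),
            "-blur", "0x%d" % sigma, "+region", dst]
```

A possible use would be to read out.tsv, call find_word_boxes(text, "regions"), and pass each resulting box to blur_command. The OCR text has to match the word exactly, so misreads like those in the output above would be missed.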