
Cleaning up noise around text

Posted: 2015-01-28T19:07:08-07:00
by g7Bond
Hi!

I've been trying to remove the background noise from images like the ones below.

[Three sample images]

My process is as follows. First I run a simple hand-made filter to remove some of the noise (it keeps only the black pixels that are surrounded by 8 other black pixels): https://github.com/vkruoso/receita-tool ... aFilter.py. After that I just run tesseract and hope the result is good.

I'm providing a free web service that gets information from a government site to make that information easier to access (this really should be provided by the government). With this process I manage to decode the text successfully 25% of the time, but that's not good enough to provide a reliable service.

I have very little background in image processing, so I'm hoping someone here can give me some hints on how to approach this particular kind of image.

--
Thanks a lot.

Re: Cleaning up noise around text

Posted: 2015-01-28T19:17:32-07:00
by fmw42
Most people here will not help to break captchas.