Help with filtering an image

Posted: 2010-05-29T13:10:40-07:00
by churrusco
Hi,

This is my first post. Apologies in advance: I have never used the ImageMagick command line before, so I'm not familiar with its options and possibilities.

Basically, I need to clean up an image before passing it through OCR. The image has a gray background at the top that is causing problems, so I want to get rid of it. I need to extract the text inside the rectangles, and I hope that filtering out the gray background will make that easier. I tried several approaches using the convert command with the -monochrome and -threshold options, but with no luck.

This is a sample image: http://img687.imageshack.us/img687/4535/facturam.png The data I want to extract ("Factura" and the date) is at the very top.

Any hints?

Re: Help with filtering an image

Posted: 2010-05-29T13:53:49-07:00
by snibgo
Median will help, eg "-median 1 -threshold 67%".

Re: Help with filtering an image

Posted: 2010-05-29T16:13:10-07:00
by fmw42
If you are on Unix, you can try my textcleaner script at the link below. I have not tried it on your image, yet.

P.S. Never mind, I just tried it and it does not help and is way too slow with such a large image.

Re: Help with filtering an image

Posted: 2010-05-30T00:15:55-07:00
by churrusco
Hi snibgo and fmw42, thanks for answering.

I tried -median and -threshold: "convert factura2.tif -median 1 -threshold 67% factura2b.tif", but unfortunately it didn't work as expected. In fact, it gives the same result I was getting with -threshold alone. The image turns completely black:

Image

Best.

Re: Help with filtering an image

Posted: 2010-05-30T07:25:07-07:00
by snibgo
Using the image you supplied above,

convert facturam.png -median 1 -threshold 67% f2.png

removes the required grey background, leaving the letters intact. This is on IM 6.6.0-8 in Windows 7.

Which version of IM are you on? If it is old, try upgrading. Which platform? If you are on Windows and using command (batch) files, don't forget to double the percent sign.
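For a Unix-like shell, the same steps can be sketched as below. This is only an illustration: the function names im_version and clean_for_ocr and the default values are assumptions made for the example, not part of ImageMagick. The radius and threshold may need tuning for other scans.

```shell
#!/bin/sh
# Sketch only: wraps the command from this thread in helper functions.
# The names im_version and clean_for_ocr are illustrative assumptions.

# Print the first line of "convert -version", which names the IM release.
im_version() {
  convert -version | head -n 1
}

# Remove the grey background: median filter first, then threshold.
#   $1 = input image
#   $2 = output image
#   $3 = median radius (defaults to 1, as used above)
#   $4 = threshold     (defaults to 67%, as used above)
clean_for_ocr() {
  convert "$1" -median "${3:-1}" -threshold "${4:-67%}" "$2"
}

# Example calls (nothing runs until you call the functions):
#   im_version
#   clean_for_ocr facturam.png f2.png
#
# In a Windows .bat file the percent sign has to be doubled:
#   convert facturam.png -median 1 -threshold 67%% f2.png
```

Calling clean_for_ocr with only two arguments reproduces the exact command shown above. A third and fourth argument override the radius and the threshold.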

Re: Help with filtering an image

Posted: 2010-05-30T09:12:06-07:00
by churrusco
Hi snibgo,

I'm using IM 6.3.7 on Ubuntu Linux. It may also be that the original image file is a .tif (imageshack converts it to PNG when uploaded).

Update: Yes, the cause is that the image is a .tif. I tried the PNG image as you suggested and it worked.

Martin

Re: Help with filtering an image

Posted: 2010-05-30T09:35:25-07:00
by snibgo
Your IM is old. I suggest you upgrade.

I converted the png to a tif, and it works for me:

D:\web\im>"%IMG%convert" facturam.png facturam.tif
D:\web\im>"%IMG%convert" facturam.tif -median 1 -threshold 67% f2.png
But your original tif may have some other issue. If you provide a URL to it, I can try it out.
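If the original cannot be shared, one way to look into it locally is sketched below. It is an assumption that the TIFF's internal layout is what matters here. The names inspect_image and clean_via_png are made up for the example, and the "[0]" suffix (which selects the first page or frame) assumes the first page is the one you want. The workaround simply mirrors what already worked in this thread: get a PNG first, then filter the PNG.

```shell
#!/bin/sh
# Sketch only: inspect a problem TIFF and work around it via PNG.
# inspect_image and clean_via_png are illustrative names, not IM commands.

# Dump everything ImageMagick knows about the file (depth, colorspace,
# compression, number of pages, ...). Useful for spotting oddities.
inspect_image() {
  identify -verbose "$1"
}

# Convert the first page of the input to PNG, then apply the filter
# from this thread to that PNG.
#   $1 = input image (for example a .tif)
#   $2 = output image
# The intermediate PNG is written next to the input, with the same base name.
clean_via_png() {
  tmp="${1%.*}.png"
  convert "$1[0]" "$tmp" &&
    convert "$tmp" -median 1 -threshold 67% "$2"
}

# Example calls (nothing runs until you call the functions):
#   inspect_image factura2.tif
#   clean_via_png factura2.tif f2.png
```

Comparing the inspect_image output for the TIFF and for the PNG may show what differs between the two files.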

Re: Help with filtering an image

Posted: 2010-05-31T00:40:46-07:00
by churrusco
No bother; I can't send the original because it contains personal data. But I get your point. I will also try updating IM.

Thanks very much for your help!