Complex autocropping and splitting up grouped characters

Post by **up_and_up** » 2012-06-14T15:34:39-07:00

I need to autocrop a set of images, clean them up and pass them on to another script to interpret them.

I have reviewed this page: http://studio.imagemagick.org/discourse ... =1&t=21112 re a recent discussion related to autocropping. It was very informative.

Assumptions:
- High res photos
- Characters to be cropped will contrast with their background (black on white, black on yellow etc)
- There may be more than one character set per photo, but if it's simpler, just assume there's only one for now.

Here is an example image I am working with:

1. I think I need to define an edge around the high contrast characters like this:

2. and then crop:

3. After that I need to grayscale and adjust contrast of the images for readability. I think I have that down with this:

convert file.jpg -colorspace Gray file_grey.jpg

convert file_grey.jpg -brightness-contrast -30x30 file_contrasted.jpg

4. Finally, I need to split up these images into single characters:

5. I will then convert the image to a txt file, translate it into bitmaps such as this one, and compare them to a bitmap dataset I have:

11111111111110000000111111111110
11111111111100000000011111111110
11111111100000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000001111111110
11000000000000000000001111111110
11000000000000000000001111111110
11111110000000000000001111111110
11111110000000000000001111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111111000000000000000111111110
11111111000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111111000000000000001111111110
11111111000000000000001111111110
11111111000000000000000000011110
11110000000000000000000000001110
11100000000000000000000000000110
11100000000000000000000000000110
11100000000000000000000000000010
11110000000000000000000000000010
11110000000000000000000000000010
11110000000000000000000000011110
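
A minimal sketch of the comparison in step 5, assuming the dataset holds bitmaps of the same dimensions as the candidate and that "closest" means the fewest differing cells (Hamming distance). The 3x3 reference bitmaps and the names `parse_bitmap`, `hamming` and `best_match` are hypothetical and only illustrate the idea.

```python
def parse_bitmap(text):
    """Turn whitespace-separated rows of '0'/'1' characters into rows of ints."""
    return [[int(ch) for ch in line] for line in text.split()]


def hamming(a, b):
    """Count the cells where two equally sized bitmaps differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_match(candidate, references):
    """Return (label, distance) for the closest reference bitmap."""
    scored = ((label, hamming(candidate, ref)) for label, ref in references.items())
    return min(scored, key=lambda pair: pair[1])


# Tiny hypothetical 3x3 references, not taken from any real dataset.
REFERENCES = {
    "1": parse_bitmap("101 101 101"),
    "7": parse_bitmap("000 110 101"),
}

candidate = parse_bitmap("101 101 111")  # a "1" with one noisy cell
label, distance = best_match(candidate, REFERENCES)
print(label, distance)
```

A distance threshold would probably be needed in practice, so that a poor best match is rejected rather than accepted.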

I need help with steps 1, 2 and 4. Does anyone have any insights?

Post by **fmw42** » 2012-06-14T16:08:56-07:00

Step 4 is outlined in the link you specified. You just have to convert to grayscale, resize the image down to one row, and look for the brightest white breaks.
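
The idea can be sketched outside IM as follows. This is only an illustration, assuming the grayscale image is already available as a list of rows of 0-255 values and approximating "resize down to one row" with a plain column average; the `white_level` threshold and the function names are hypothetical.

```python
def column_means(gray):
    """Average each column, approximating a resize of the image to one row."""
    height = len(gray)
    return [sum(row[x] for row in gray) / height for x in range(len(gray[0]))]


def split_columns(gray, white_level=250):
    """Return (start, end) column ranges, end exclusive, between white gaps."""
    spans, start = [], None
    for x, mean in enumerate(column_means(gray)):
        is_gap = mean >= white_level
        if not is_gap and start is None:
            start = x                      # a glyph begins here
        elif is_gap and start is not None:
            spans.append((start, x))       # a white break ends the glyph
            start = None
    if start is not None:                  # glyph touching the right edge
        spans.append((start, len(gray[0])))
    return spans


# Hypothetical 3-row image: two dark glyphs separated by a white column.
W, B = 255, 0
image = [
    [W, B, B, W, B, W],
    [W, B, W, W, B, W],
    [W, B, B, W, B, W],
]
spans = split_columns(image)
print(spans)
```

Each returned range could then be used as the horizontal extent of one crop.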

Steps 1 and 2 are very hard. How, for example, would you expect any software to find those numbers when there is similar high-contrast red/blue on a white background on the shirt right above them? If you can find the area, then you don't need to put any boxes around it. Finding the right areas and recognizing that they contain letters or numbers is the hard part. I really don't have any idea how one would do that sort of thing, especially as the numbers are distorted by rotations and folds in the shirts. Furthermore, one does not know how big the numbers will be.

I just do not think IM has enough of such tools to really help. But I will defer to others who might have some good ideas.

Post by **anthony** » 2012-06-14T16:51:34-07:00

Rather than look for numbers, look for 'white patches' with 'red contents'. Anything else is extra.

If a white patch is not big enough (easy), or not square enough (hard), then ignore it.

Forget about separating digits or recognising them. Just extract them, filter for high contrast, and feed the image into OCR software such as ocrad, gocr, or tesseract (to name a few free ones) to translate them to numbers.

If you don't get numbers 'flag' the image for human review.

PS: you know the 'captcha' (I am human) tests you see with two words? They are actually doing OCR improvement.
One word is 'known' (the verification of your humanity), the other is 'unknown' (OCR rejected). If enough humans give the same answer for the second word, it can then be converted from 'unknown' to 'known'.

Post by **up_and_up** » 2012-06-14T21:26:00-07:00

Thank you both very much for taking the time to respond to this. I really respect your time and authority on these subjects.

The above was not my first approach at it.

Initially, I was feeding greyscale images into tesseract and getting about 30-50% correct. Issues I was running into:

1. Smaller numbers on a big image generally didn't work.
2. Numbers that were less solid in color, for example because of a slight fold, shadows or sunlight, were not correctly interpreted by tesseract.

I do think I can still use tesseract, but need to be able to crop around the white areas and flatten the color a bit.

I was planning for a conditional multi-step process:

1. Use tesseract for OCR
2. If tesseract yields nothing then try the bitmap comparison
3. If still nothing flag for human review
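
A rough sketch of that fallback chain, assuming each recogniser is wrapped in a function that takes an image and returns a string (or nothing). The stand-in functions below are hypothetical placeholders, not real tesseract or bitmap-matching calls.

```python
def read_number(image, ocr, bitmap_match):
    """Try OCR first, then bitmap comparison; otherwise flag for review.

    Returns (text, source), where source names the step that succeeded.
    """
    for source, reader in (("ocr", ocr), ("bitmap", bitmap_match)):
        text = reader(image)
        # Only accept a result that consists purely of digits.
        if text and text.strip().isdigit():
            return text.strip(), source
    return None, "human_review"


# Hypothetical stand-ins for the real recognisers.
def fake_ocr(image):
    return ""      # pretend the OCR engine found nothing


def fake_bitmap_match(image):
    return "42"    # pretend the bitmap comparison found a number


result = read_number("photo.jpg", fake_ocr, fake_bitmap_match)
print(result)
```

Passing the recognisers in as arguments keeps the control flow testable without either tool installed.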

In both 1 and 2 above I would still benefit from cropping in and filtering for high contrast.

fmw42 - you make good points. Thanks for the pointer re step 4 (splitting the numbers). My only idea around this was taking some input, like the font color and the direct background color, and searching the image for that combination. For example, a user could select the color of the font and the color of the square background with a color picker. If I chose that route, would that make it easy to search the image for the particular combination, in this case red on white? I'm guessing that IM can detect a particular color or a similar palette and detect the color around that color. I hope I am being clear. Say you had a red square inside a white circle and a bunch of red circles on the same image. Assuming I give you approximate RGB values for the red and white colors, would IM be able to detect the red square inside the white circle, even though those same colors may exist elsewhere in the image but not in that combination? See here:

anthony - Thanks for the advice. "white patch is not big enough" - I assume you mean looking for blocks of color within a certain range? See above for a description of what I am now thinking. Let's say I have the approximate color of the font and background, say red on white. Do you have any existing examples of a script that would:
1. Look for a chunk of color (within an approximate range) which is larger than a minimum pixel size.
2. Check whether it contains a second color, again within an approximate range.
3. Crop around the background color.
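
To make the three steps concrete, here is a small pure-Python sketch. It assumes the image is a list of rows of RGB tuples, uses a per-channel tolerance as the "approximate range", and treats "contains" as "the font color appears inside the patch's bounding box". The names `near` and `find_patches` and the default values are hypothetical.

```python
from collections import deque


def near(pixel, target, tol):
    """True if every channel is within tol of the target colour."""
    return all(abs(p - t) <= tol for p, t in zip(pixel, target))


def find_patches(img, background, font, tol=40, min_pixels=4):
    """Bounding boxes (left, top, right, bottom), inclusive, of background
    regions that are large enough and enclose at least one font pixel."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or not near(img[y0][x0], background, tol):
                continue
            # Flood fill one 4-connected region of background colour.
            queue, region = deque([(x0, y0)]), []
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < w and 0 <= ny < h
                    if inside and not seen[ny][nx] and near(img[ny][nx], background, tol):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(region) < min_pixels:
                continue                      # step 1: region too small
            xs = [p[0] for p in region]
            ys = [p[1] for p in region]
            box = (min(xs), min(ys), max(xs), max(ys))
            has_font = any(near(img[y][x], font, tol)
                           for y in range(box[1], box[3] + 1)
                           for x in range(box[0], box[2] + 1))
            if has_font:                      # step 2: second colour present
                boxes.append(box)             # step 3: rectangle to crop
    return boxes


G, W, R = (0, 128, 0), (255, 255, 255), (255, 0, 0)
img = [
    [G, G, G, G, G],
    [G, W, W, W, G],
    [G, W, R, W, G],
    [G, W, W, W, G],
    [G, G, G, G, R],   # stray red pixel outside any white patch
]
boxes = find_patches(img, background=W, font=R)
print(boxes)
```

On real photos the tolerance, the minimum size and the "square enough" test mentioned earlier would all need tuning, and a pure-Python flood fill would be slow on high-resolution images.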

For the OCR part, we are thinking along the same lines. I had already tried tesseract, which yielded mixed results. I need to preprocess the images better, hence the above questions and thoughts. I was also considering a machine learning component of some sort, for correct matches.

Thanks very much for your time and input!

Post by **fmw42** » 2012-06-14T22:11:38-07:00

The red circle/square is certainly much easier, but there are still issues. IM can convert all the red items to white on a black background. From there my script, separate, can locate all the white (formerly red) items, tell you where they are located, and give you each one as an image. See my script separate at the link at the very bottom to my web site.

The problem is that IM has no direct way to tell you which shapes are squares and which are circles. One could do a polar transformation on each image and look at its structure. If you have a circle and do a polar transformation, you will get an image with a nice flat horizontal separator between the red and the outer white. With a square, the polar transformation would have peaks and valleys. See -distort depolar at http://www.imagemagick.org/Usage/distorts/#depolar.
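
The same flat-versus-peaked behaviour can be illustrated without IM by measuring, for each angle, the distance from the shape's centroid to its edge. This is only a sketch of the idea, not a reimplementation of -distort depolar; it assumes a binary mask given as rows of 0/1 values and a shape that contains its own centroid. The names `radial_profile` and `roundness` are hypothetical.

```python
import math


def radial_profile(mask, angles=72):
    """Distance from the shape's centroid to its edge along each angle."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    h, w = len(mask), len(mask[0])
    profile = []
    for i in range(angles):
        a = 2 * math.pi * i / angles
        r = 0.0
        while True:
            # Look half a pixel further out along the ray.
            x = int(round(cx + (r + 0.5) * math.cos(a)))
            y = int(round(cy + (r + 0.5) * math.sin(a)))
            if not (0 <= x < w and 0 <= y < h) or not mask[y][x]:
                break
            r += 0.5
        profile.append(r)
    return profile


def roundness(mask):
    """Shortest radius divided by longest: near 1 for a disc, lower for a square."""
    profile = radial_profile(mask)
    return min(profile) / max(profile)


def make_mask(size, inside):
    """Build a size x size 0/1 mask from a predicate on (x, y)."""
    return [[1 if inside(x, y) else 0 for x in range(size)] for y in range(size)]


c = 30
disc = make_mask(61, lambda x, y: (x - c) ** 2 + (y - c) ** 2 <= 20 ** 2)
square = make_mask(61, lambda x, y: abs(x - c) <= 20 and abs(y - c) <= 20)
print(roundness(disc), roundness(square))
```

For the square the profile peaks toward the corners, which is the "peaks and valleys" pattern described above; for the disc it stays nearly flat.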

The other way is to take each white-on-black image and then do some statistical analysis of the shape. The main concept has to do with image moments. But IM does not have those coded. See
http://en.wikipedia.org/wiki/Image_moment
http://www.aishack.in/2011/06/image-moments/
http://opencv.itseez.com/doc/tutorials/ ... ments.html.

IM has some simple statistics, including skewness and kurtosis, in its verbose information. I did one simple thing on an arbitrary shape to get its area. See http://www.fmwconcepts.com/imagemagick/ ... shape_mean. Here for example you have a binary ellipse. IM can give you the mean, which converts to the total number of white pixels. It can also give you the standard deviation, skewness and kurtosis.

identify -verbose shape_ellipse_mask.gif
Image: shape_ellipse_mask.gif
  ...
  Channel statistics:
    Gray:
      min: 0 (0)
      max: 255 (1)
      mean: 68.1615 (0.2673)
      standard deviation: 112.85 (0.44255)
      kurtosis: -0.89407
      skewness: 1.05163
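
As a worked check on these numbers: for a strictly two-valued mask, the normalized mean p is the fraction of white pixels, and the other three statistics follow from p alone (they are the moments of a 0/1 variable). The sketch below assumes the reported kurtosis is excess kurtosis, which is consistent with the values in the listing; the function names and the 100x100 size are hypothetical.

```python
import math


def binary_mask_stats(p):
    """Statistics of a 0/1 image in which a fraction p of the pixels is white."""
    var = p * (1 - p)
    return {
        "mean": p,
        "standard deviation": math.sqrt(var),
        "skewness": (1 - 2 * p) / math.sqrt(var),
        "kurtosis": (1 - 6 * var) / var,   # excess kurtosis
    }


def white_pixels(p, width, height):
    """Area of the shape: white fraction times the total pixel count."""
    return round(p * width * height)


stats = binary_mask_stats(0.2673)   # normalized mean from the identify listing
for name, value in stats.items():
    print(f"{name}: {value:.5f}")

# Hypothetical image size, only to show the mean-to-area conversion.
print(white_pixels(0.2673, 100, 100))
```

The computed values agree with the listing to the precision shown, which suggests that on a binary mask these statistics describe only the area and say nothing further about the shape.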

But I would look into OpenCV for many more image analysis functions, such as Hough circle or line transforms. I don't know for sure, but I would be surprised if it did not do image moments. I think also that ImageJ has some plugins for image moments. (see http://www.google.com/url?sa=t&rct=j&q= ... 2LkuZ2jIMg)

For example, you could take your circle and square images, get an edge outline, and use the Hough line transform to tell if it is a square, or use the Hough circle transform directly to find all circles. Search Google for Hough transform.

None of this tells you about your numerals, however. Image moments, in their invariant combinations (such as the Hu moments), are rotation and scale invariant and could do that in theory for each digit 0-9. One would have to compute the moments for some standard digit variations and find the general characteristics. Then do the same on your numeral images and compare to the reference stats to pick out the number.
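
A small sketch of what such a comparison could look like, assuming binary masks given as rows of 0/1 values. It computes only the first two Hu invariants from normalized central moments; the function name `hu_first_two` and the rectangle test shapes are hypothetical, and real digits would need more invariants and real reference data.

```python
def hu_first_two(mask):
    """First two Hu invariants of a binary mask (rows of 0/1 values)."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = len(pts)                           # area
    cx = sum(x for x, _ in pts) / m00        # centroid
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        """Normalized central moment: translation and scale invariant."""
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    return phi1, phi2


def rect(width, height):
    """A filled width x height rectangle as a 0/1 mask."""
    return [[1] * width for _ in range(height)]


bar = hu_first_two(rect(4, 12))
bar_scaled_rotated = hu_first_two(rect(24, 8))   # 2x larger, turned 90 degrees
square = hu_first_two(rect(10, 10))
print(bar, bar_scaled_rotated, square)
```

The two bars give nearly the same pair of values despite the change in size and orientation, while the square differs clearly. The small remaining difference between the bars comes from pixel discretization.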

I have never really done much directly with image moments other than read about them, but perhaps a Google search will give you more information and perhaps someone has written a paper about distinguishing numerals or alphabetic characters by image moments.

Fred

Post by **anthony** » 2012-06-14T22:31:47-07:00

fmw42 wrote: The other way is to take each white-on-black image and then do some statistical analysis of the shape. The main concept has to do with image moments. But IM does not have those coded. See
http://en.wikipedia.org/wiki/Image_moment
http://www.aishack.in/2011/06/image-moments/
http://opencv.itseez.com/doc/tutorials/ ... ments.html.

Actually IM does have something along those lines, I just don't understand them

The operation is -features and -unique
and they produce verbose identify output.

Post by **up_and_up** » 2012-06-15T09:13:46-07:00

Thanks very much for your replies. They are very helpful. I will report back if I discover anything new.

Post by **fmw42** » 2012-06-15T10:02:41-07:00

anthony wrote:
Actually IM does have something along those lines, I just don't understand them

The operation is -features and -unique
and they produce verbose identify output.

-features is interesting, but not quite the same, as it is directional and distance dependent. But it does have quite a few measures, including some moments and co-occurrence matrices. Never knew it was there. These are all local texture measures defined by the distance. Does it output an image where each pixel represents the feature metric in its local vicinity, or just an average metric over the whole image? I will have to explore this some.

I do not see any -unique command on the options page.

Post by **fmw42** » 2012-06-15T11:33:18-07:00

anthony wrote:
Actually IM does have something along those lines, I just don't understand them

The operation is -features and -unique
and they produce verbose identify output.

I have tried several commands and cannot figure out the syntax. What is the proper syntax?

convert shape_ellipse_mask.gif -features 10 null:
convert: unrecognized option `-features' @ error/convert.c/ConvertImageCommand/1570.

convert shape_ellipse_mask.gif -features 10 -verbose info:
convert: unrecognized option `-features' @ error/convert.c/ConvertImageCommand/1570.

Post by **anthony** » 2012-06-17T18:16:10-07:00

No idea! I think it may be for identify. It is in the Change logs.
