Complex autocropping and splitting up grouped characters
Posted: 2012-06-14T15:34:39-07:00
I need to autocrop a set of images, clean them up and pass them on to another script to interpret them.
I have reviewed this page: http://studio.imagemagick.org/discourse ... =1&t=21112 re a recent discussion related to autocropping. It was very informative.
Assumptions:
- High-res photos
- Characters to be cropped will contrast with their background (black on white, black on yellow, etc.)
- There may be more than one character set per photo, but if it's simpler, just assume there's only one for now.
Here is an example image I am working with:

1. I think I need to define an edge around the high-contrast characters like this:

2. And then crop:

3. After that I need to convert the images to grayscale and adjust their contrast for readability. I think I have that down with this:
Code:
convert file.jpg -colorspace Gray file_grey.jpg
Code:
convert file_grey.jpg -brightness-contrast -30x30 file_contrasted.jpg

4. Finally, I need to split up these images into single characters:

5. I will then convert each image to a txt file, translate it into a bitmap such as this one, and compare it to a bitmap dataset I have:
11111111111110000000111111111110
11111111111100000000011111111110
11111111100000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000011111111110
11000000000000000000001111111110
11000000000000000000001111111110
11000000000000000000001111111110
11111110000000000000001111111110
11111110000000000000001111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111111000000000000000111111110
11111111000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111110000000000000000111111110
11111111000000000000001111111110
11111111000000000000001111111110
11111111000000000000000000011110
11110000000000000000000000001110
11100000000000000000000000000110
11100000000000000000000000000110
11100000000000000000000000000010
11110000000000000000000000000010
11110000000000000000000000000010
11110000000000000000000000011110
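A rough sketch of how step 5 could work, not a tested pipeline: ImageMagick can dump pixels as text through its txt: output format, and a small awk filter can fold that into rows of 0s and 1s like the ones above. The helper names txt_to_bitmap and bitmap_distance are made up for illustration. The filter assumes the image has already been thresholded to pure black and white, and the exact txt: layout can differ between ImageMagick versions, so the parsing may need adjusting.

```shell
#!/bin/sh
# Sketch only. txt_to_bitmap and bitmap_distance are hypothetical helper names.

# Read ImageMagick "txt:" pixel lines on stdin and print one row of 0/1 per
# image row. Assumption: the image is already pure black and white, so each
# pixel line has a hex colour starting with #00 (black) or something else
# (treated as white). Black becomes 0 and white becomes 1.
txt_to_bitmap() {
    awk '
        /^#/ { next }                       # skip the header comment line
        {
            split($1, xy, /[,:]/)           # first field looks like "12,5:"
            y = xy[2] + 0
            hex = ""
            for (i = 2; i <= NF; i++)
                if ($i ~ /^#/) { hex = $i; break }
            bit = "1"
            if (substr(hex, 2, 2) == "00") bit = "0"
            if (seen && y != last_y) printf "\n"   # a new image row starts
            printf "%s", bit
            last_y = y
            seen = 1
        }
        END { if (seen) printf "\n" }
    '
}

# Count the cells that differ between two bitmap files of identical size.
bitmap_distance() {
    paste -d " " "$1" "$2" | awk '
        {
            for (i = 1; i <= length($1); i++)
                if (substr($1, i, 1) != substr($2, i, 1)) d++
        }
        END { print d + 0 }
    '
}
```

Usage might look like "convert file_contrasted.jpg -threshold 50% txt:- | txt_to_bitmap > char.txt", where 50% is only a placeholder threshold, followed by "bitmap_distance char.txt reference.txt" for each entry in the dataset (reference.txt being a hypothetical dataset file). The entry with the lowest count would be the closest match.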
I need help with steps 1, 2 and 4. Does anyone have any insights?
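As a possible starting point for steps 1, 2 and 4, here is a sketch under stated assumptions rather than a worked-out answer. For cropping it leans on convert's -fuzz and -trim options, which only help when the background around the characters is fairly uniform; the 20% fuzz default is a placeholder to tune per photo. For splitting it assumes the cropped strip has already been turned into rows of 0 (ink) and 1 (background) as in step 5, and that neighbouring characters are separated by at least one column with no ink. autocrop and char_columns are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only. autocrop and char_columns are hypothetical helper names.

# Steps 1 and 2: trim away everything that is close to the border colour.
# Usage: autocrop input output [fuzz]
# Assumption: ImageMagick convert is on PATH and the background is roughly uniform.
autocrop() {
    in=$1
    out=$2
    fuzz=${3:-"20%"}                 # placeholder tolerance, tune per photo
    convert "$in" -fuzz "$fuzz" -trim +repage "$out"
}

# Step 4: read rows of 0/1 and print "start width" (0-based start column)
# for every run of columns that contains at least one ink pixel (0).
# Columns that hold only 1s are treated as gaps between characters.
char_columns() {
    awk '
        {
            if (length($0) > w) w = length($0)
            for (i = 1; i <= length($0); i++)
                if (substr($0, i, 1) == "0") ink[i] = 1
        }
        END {
            start = 0
            for (i = 1; i <= w + 1; i++) {
                if (i <= w && (i in ink)) {
                    if (!start) start = i          # a run of ink columns begins
                } else if (start) {
                    print start - 1, i - start     # the run ended at column i - 1
                    start = 0
                }
            }
        }
    ' "$@"
}
```

For example, "autocrop file.jpg file_cropped.jpg" would write the trimmed image, and "char_columns strip.txt" would print one "start width" pair per character found in a 0/1 strip. Each pair could then be handed to convert's -crop as the horizontal offset and width. Characters that touch, or a noisy background, would defeat this simple projection and need something more robust.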