Trying to get the best result with ImageMagick and Tesseract OCR image recognition
Posted: 2017-07-04T19:52:59-07:00
Hello, I'm trying to use Tesseract OCR to recognize some letters in an image.
I converted the images with ImageMagick and the results look clean, but that is still not enough for Tesseract to recognize the letters correctly.
The original images:
data:image/s3,"s3://crabby-images/8e175/8e17584f86f5aaa1a17c2faa081973c19cf1319f" alt="Image"
data:image/s3,"s3://crabby-images/76d72/76d721308b62ccc4f537c71debecc0f0c091e0a6" alt="Image"
data:image/s3,"s3://crabby-images/c85ec/c85ec90af5fafb99872f7223789ac22177e4aeef" alt="Image"
data:image/s3,"s3://crabby-images/9668a/9668acbbf6cbbb6496fece170ec411c46185f673" alt="Image"
The ImageMagick command used for the conversion (with many thanks to @fmw42):
convert input.jpg -fuzz 50% -fill black -opaque black -bordercolor white -border 2 -fill black -draw "color 0,0 floodfill" -alpha off -negate -units pixelsperinch -density 72 output.jpg
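Since there are several input images, the same command could be applied in a loop. This is only a rough sketch: the `sample_*.jpg` pattern, the `_clean` suffix, and the helper names `cleaned_name` and `clean_for_ocr` are assumptions for illustration; the `convert` options are copied unchanged from the command above.

```shell
#!/bin/sh
# Derive the output name: sample_1.jpg -> sample_1_clean.jpg
# (the "_clean" suffix is a hypothetical naming choice).
cleaned_name() {
    printf '%s_clean.jpg\n' "${1%.*}"
}

# Apply the convert command from the post to one input/output pair.
clean_for_ocr() {
    convert "$1" -fuzz 50% -fill black -opaque black \
        -bordercolor white -border 2 \
        -fill black -draw "color 0,0 floodfill" \
        -alpha off -negate -units pixelsperinch -density 72 "$2"
}

# "sample_*.jpg" is a placeholder pattern for the original images.
for f in sample_*.jpg; do
    [ -e "$f" ] || continue                    # glob matched nothing
    case $f in *_clean.jpg) continue ;; esac   # do not reprocess outputs
    clean_for_ocr "$f" "$(cleaned_name "$f")"
done
```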
The result images:
data:image/s3,"s3://crabby-images/aa357/aa357097523236f1659a7fdf9c5dc216a0cf52c1" alt="Image"
data:image/s3,"s3://crabby-images/6cf8e/6cf8e3c3dfb4f4847b0ef5e23427d373df8e7bad" alt="Image"
data:image/s3,"s3://crabby-images/7232c/7232c3018ab8bd248f4ea395973a66c12a636644" alt="Image"
data:image/s3,"s3://crabby-images/4a63f/4a63f72880cf0c6167e01f11dfef5bf569c5150d" alt="Image"
The Tesseract OCR command:
$ tesseract output.jpg out -psm 7
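To check several converted images in one go, the recognized text could be printed straight to the terminal instead of being written to `out.txt`. A minimal sketch, assuming the converted files match `output*.jpg` and using a hypothetical helper `ocr_line`; passing `stdout` as the output base makes Tesseract print the text, and the banner is dropped by discarding stderr.

```shell
#!/bin/sh
# Run Tesseract on one image and print the first recognized line
# with all whitespace removed (fine for a single short word).
ocr_line() {
    # -psm 7 is the flag as written in the post;
    # newer Tesseract releases spell it --psm 7.
    tesseract "$1" stdout -psm 7 2>/dev/null | head -n 1 | tr -d '[:space:]'
}

# "output*.jpg" is an assumed pattern for the converted images.
for f in output*.jpg; do
    [ -e "$f" ] || continue          # glob matched nothing
    printf '%s: %s\n' "$f" "$(ocr_line "$f")"
done
```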
Output/result:
Text: AUGU -> AUOU
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica Page 1
Text: VEGU -> VOR-OU
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica Page 1
Text: EGUV -> E6UV
Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica Page 1
Text: USEA -> USSOEA
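To compare different filter settings, it may help to score the results automatically rather than by eye. The sketch below counts exact matches between the expected text and what Tesseract returned; `score_pairs` is a hypothetical helper, and the four pairs are the ones listed above.

```shell
#!/bin/sh
# Read "expected actual" pairs from stdin, print one line per pair,
# then print how many pairs matched exactly.
score_pairs() {
    total=0
    correct=0
    while read -r expected actual; do
        [ -n "$expected" ] || continue      # skip blank lines
        total=$((total + 1))
        if [ "$expected" = "$actual" ]; then
            correct=$((correct + 1))
            echo "OK    $expected"
        else
            echo "MISS  $expected -> $actual"
        fi
    done
    echo "exact matches: $correct/$total"
}

# Pairs taken from the results in this post.
score_pairs <<'EOF'
AUGU AUOU
VEGU VOR-OU
EGUV E6UV
USEA USSOEA
EOF
```

With the current results this ends with `exact matches: 0/4`, so any change to the conversion step can be judged against that baseline.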
Could I use some filter to get better results? What do you think?