Can we extract blocks of text as images?
Posted: 2018-03-18T14:24:50-07:00
Hi,
My final goal is to cut the words out of a scanned text as separate images, have Tesseract recognize the text in each one, and rename each image file after the word it contains.
[Image]
For the first line of this image, there would be one image containing FRUIT, another containing VINES, and so on, each with a random name at first. If that is possible, I would then pass each image to Tesseract and rename each file after its content: the first one would be called FRUIT.png, the second VINES.png, and so on.
I would then be able to rearrange the word images and recombine them into groups of words (FRUIT VINES) as images.
Do you think the first step could be done with ImageMagick?
Thanks a lot.
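
To make the first step concrete, here is a minimal sketch in plain Python of one common way to split a text line into words: look for runs of empty pixel columns. It is only an illustration under assumptions, not an ImageMagick recipe. It assumes the line has already been isolated and binarized into a grid of 0/1 values (1 = ink), and the names `word_spans`, `crop`, and `min_gap` are made up for this example.

```python
from typing import List, Tuple


def word_spans(rows: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per word.

    rows is one binarized text line: 1 = ink, 0 = background.
    A run of at least min_gap empty columns separates two words;
    shorter runs are treated as spacing between letters.
    """
    if not rows:
        return []
    width = len(rows[0])
    # Column projection: True where the column contains any ink.
    has_ink = [any(row[x] for row in rows) for x in range(width)]

    spans = []
    start = None     # first inked column of the current word
    last_ink = None  # most recent inked column seen
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The empty run is wide enough: close the word, open a new one.
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


def crop(rows: List[List[int]], span: Tuple[int, int]) -> List[List[int]]:
    """Cut the columns of one word out of the line."""
    start, end = span
    return [row[start:end] for row in rows]


# Toy line: two "words" separated by four empty columns.
line = [[1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0]]
spans = word_spans(line, min_gap=3)
words = [crop(line, s) for s in spans]
```

Each cropped word could then be saved under a temporary name, passed to Tesseract, and renamed after the recognized text. The right `min_gap` depends on the font and scan resolution, so it would need tuning on real scans.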