Convert sometimes very slow with creating histogram

Posted: 2010-05-27T12:42:18-07:00
by bartgrefte
Hi all :)

I've created a batch file that makes the work my colleagues and I need to do a lot easier.
I'm not sure how much I'm allowed to say, but basically we need to (amongst other things) check PNGs to see if they meet a number of requirements.
One of those requirements is that they must contain two or three specific colors. The used colors, plus all the other information we need to check, are gathered by a batch file that uses pngcheck.exe, ImageMagick's convert.exe and Swiss File Knife.

When using convert to output a list of used colors, the output looks like:

#   12293: (  0,  0,  0,255) #000000 black
#  48987707: (255,255,255,  0) #FFFFFF00 rgba

Command used:

convert.exe "data\%%a" -format %%c histogram:info:- >> output.txt

The problem: memory usage sometimes gets very high (400-500 MB) and the command takes longer to finish.

At first glance (I haven't confirmed this), it seems to happen with PNGs that have more than one or two IDAT chunks.
Just thinking out loud: I think it's because the output contains the number of pixels used by every color. Could gathering that information cause this problem?
If so, how can I use convert so that it only outputs the used colors and nothing else that could make it take longer to finish?

With regards,

Bart
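
To test the IDAT hunch above without touching the images, the chunks of each PNG can be counted directly. This is a minimal sketch that relies only on the PNG container layout (an 8-byte signature followed by chunks of length, type, data and CRC). The names `count_chunks`, `make_chunk` and `demo_png` are made up for illustration, and the demo image is synthetic, not one of the real files.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def count_chunks(data):
    """Count chunk types in a PNG held in memory."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    counts = {}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        name = ctype.decode("ascii", "replace")
        counts[name] = counts.get(name, 0) + 1
        pos += 12 + length  # skip length + type + data + CRC
        if name == "IEND":
            break
    return counts


def make_chunk(ctype, payload):
    """Demo helper only: serialise one chunk with a valid CRC."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


def demo_png():
    """Synthetic 1x1 grayscale PNG whose pixel data is split over three IDAT chunks."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one gray sample
    parts = [idat[:3], idat[3:6], idat[6:]]
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + b"".join(make_chunk(b"IDAT", p) for p in parts)
            + make_chunk(b"IEND", b""))


if __name__ == "__main__":
    print(count_chunks(demo_png()))
```

For a real file, read it in binary mode and pass the bytes to `count_chunks`. Logging the IDAT count next to the run time of each convert call would show whether the two are actually related.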

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T13:19:10-07:00
by fmw42
PNG files are 24-bits per pixel (8 bits per rgb channel) on Q8 (or 48-bits per pixel on Q16). Thus you have millions of colors. You can speed things up if you reduce the number of colors in your image and if you run Q8 IM.

convert image +dither -colors 256 -depth 8 -format "%c" histogram:info:

You can remove the -depth 8 if you run Q8 IM rather than Q16 IM.

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T13:37:16-07:00
by snibgo
The histogram method finds unique colours, then sorts them, so it isn't fast.

fmw: But color reduction might eliminate the queried colour?

How about this (Windows 7 script):

I want to count how many pixels are (252,0,0). I make all the other colours transparent, then convert these to black (or any colour other than 252,0,0).

convert in.png ^
  -alpha opaque ^
  +transparent rgb(252,0,0) ^
  -background black -flatten ^
  -format %%c histogram:info:out.txt

Sample output:

245169: ( 0, 0, 0,255) #000000 black
55: (252, 0, 0,255) #FC0000 rgba(252,0,0,1)
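
Histogram listings in this form can be reduced to just the colors with a few lines of script. The following is a minimal sketch, assuming every entry has the shape `count: (channels) #HEX name` as in the sample output above. The name `parse_histogram` is made up for illustration, and other ImageMagick versions or quantum depths may print the fields differently.

```python
import re

# count: (r, g, b[, a]) #HEX name   (an optional leading "#" is tolerated)
HIST_LINE = re.compile(r"^\s*#?\s*(\d+):\s*\(([^)]*)\)\s+(#[0-9A-Fa-f]+)\s*(.*)$")

SAMPLE = """\
245169: ( 0, 0, 0,255) #000000 black
55: (252, 0, 0,255) #FC0000 rgba(252,0,0,1)
"""


def parse_histogram(text):
    """Return a list of (pixel_count, channel_tuple, hex_color, name) entries."""
    entries = []
    for line in text.splitlines():
        match = HIST_LINE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a histogram entry
        count = int(match.group(1))
        channels = tuple(int(value) for value in match.group(2).split(","))
        entries.append((count, channels, match.group(3).upper(), match.group(4).strip()))
    return entries


if __name__ == "__main__":
    for count, channels, hex_color, name in parse_histogram(SAMPLE):
        print(hex_color, channels, count)
```

Dropping the count column after parsing gives the plain list of used colors. It does not make convert itself any faster, since the counting has already happened by then.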

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T13:39:13-07:00
by bartgrefte
fmw42 wrote:PNG files are 24-bits per pixel (8 bits per rgb channel) on Q8 (or 48-bits per pixel on Q16). Thus you have millions of colors. You can speed things up if you reduce the number of colors in your image and if you run Q8 IM.

convert image +dither -colors 256 -depth 8 -format "%c" histogram:info:

You can remove the -depth 8 if you run Q8 IM rather than Q16 IM

We are not allowed to edit the PNGs; we must only check whether they meet the requirements.
There aren't many colors in there. Mostly there are 2 or 3 (each with a specific RGB value), depending on the picture, no more.
Unless something is wrong and there are more colors in there than there should be. But I've never run into a PNG with more than 5 colors.

I only want to know why convert needs more time and uses more memory with some of the pictures, and whether it is possible to speed up the process.
All we need from convert is the list of used colors. There is no need for it to count the number of pixels each color uses, if that is what's causing the delay with some of the pictures.

As for Q8/Q16, I'm currently using the convert.exe from the portable version (where convert.exe is ~5 MB in size), because that one runs without ImageMagick being installed; it just needs two DLLs in the same folder.

snibgo wrote:The histogram method finds unique colours, then sorts them, so it isn't fast.

It's fast enough with most pictures and, like I said, I only want to know what's causing the delay and high memory usage with some of the pictures.
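
Once the used colors are known, the check itself is a set comparison. This is a minimal sketch, assuming colors are compared as RGB tuples. The function name `check_palette` and the color values in `REQUIRED` and `TOLERATED` are placeholders, since the thread does not state the real requirements.

```python
def check_palette(used, required, tolerated=()):
    """Compare the colors found in an image against the required set.

    used      -- iterable of (r, g, b) tuples found in the image
    required  -- colors that must all be present
    tolerated -- extra colors that may appear without failing the check
    """
    used, required, tolerated = set(used), set(required), set(tolerated)
    missing = required - used
    unexpected = used - required - tolerated
    return {
        "ok": not missing and not unexpected,
        "missing": sorted(missing),
        "unexpected": sorted(unexpected),
    }


# Placeholder values, not the real requirements.
REQUIRED = {(0, 0, 0), (252, 0, 0)}
TOLERATED = {(255, 255, 255)}

if __name__ == "__main__":
    found = [(0, 0, 0), (252, 0, 0), (255, 255, 255)]
    print(check_palette(found, REQUIRED, TOLERATED))
```

Reporting the missing and unexpected colors separately makes it clear why a picture was rejected, not just that it was.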

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T13:48:55-07:00
by snibgo
fmw's technique doesn't alter the image file, merely an in-memory copy.

As far as I know, performance should only suffer when there are a large number of distinct colours in the image (many more than 5!).

My script won't suffer when there are many distinct colours.

Can you post one of the poor-performing images? Which version of IM are you on?

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T14:50:49-07:00
by fmw42
You can get the number of unique colors from IM by using:

convert image -format "%k" info:

You can get a list of unique colors by using:

convert image -unique-colors -depth 8 txt:-

This will be faster, I would hope, than using histogram:info:

Please post a link to a sample PNG for us to examine, one that you think runs slowly with histogram:info:. Unless it is very special, you probably have more colors than you think.

By the way, if your image has more than 1024 colors in it, histogram:info: will color reduce your image to 1024 colors.
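
The `txt:-` listing from the `-unique-colors` command can be turned into a plain color list with a short script. This is a minimal sketch, assuming the usual `txt:` layout of one `#` header line followed by `x,y: (channels) #HEX name` entries. The sample text is illustrative rather than real output, the name `unique_colors` is made up, and the exact fields vary between ImageMagick versions.

```python
import re

# x,y: (channels)  #HEX  name
TXT_LINE = re.compile(r"^\s*(\d+),(\d+):\s*\(([^)]*)\)\s+(#[0-9A-Fa-f]+)")

# Illustrative listing only; a real run may format the fields differently.
SAMPLE_TXT = """\
# ImageMagick pixel enumeration: 3,1,255,rgb
0,0: (  0,  0,  0)  #000000  black
1,0: (252,  0,  0)  #FC0000  rgb(252,0,0)
2,0: (255,255,255)  #FFFFFF  white
"""


def unique_colors(text):
    """Return the hex colors listed in a txt: pixel enumeration, in order."""
    colors = []
    for line in text.splitlines():
        match = TXT_LINE.match(line)
        if match:  # the "# ImageMagick pixel enumeration" header does not match
            colors.append(match.group(4).upper())
    return colors


if __name__ == "__main__":
    print(unique_colors(SAMPLE_TXT))
```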

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T15:08:53-07:00
by snibgo
fmw42 wrote:By the way, if your image has more than 1024 colors in it, histogram:info: will color reduce your image to 1024 colors.

Is it supposed to? It doesn't under IM 6.6.0-8.

convert photo.jpg -format %c histogram:info:out.txt

for a typical photo gives many thousands of entries.

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T16:18:40-07:00
by fmw42
Perhaps that has been changed at some point. At one time it was limited to 1024 (I had asked Magick about it). If you get more, then I suppose they have changed it, but I don't know what the limit is now.

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T20:26:24-07:00
by snibgo
An artificial example of 12 million pixels, each of a different colour:

convert ^
  -size 3000x4000 xc: ^
  -sparse-color Bilinear "0,0 White 2999,0 Red 0,3999 Blue 2999,3999 Black" ^
  allcols.png

convert ^
  allcols.png ^
  -format %%c histogram:info:out.txt

The convert ... histogram command takes 14 minutes and 380 MB of memory, then fails with "convert: unable to allocate string `Not enough space' @ fatal/string.c/ConstantString/637.".

A less extreme 1000x1000 takes 66 seconds and writes the expected million lines.
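
The way the cost grows with the number of distinct colours can be seen in a toy model of "find unique colours, then sort them". This is a rough sketch in plain Python, not ImageMagick's actual implementation. The names `histogram` and `gradient` are made up, and `gradient` is only a small stand-in for the allcols.png test image.

```python
from collections import Counter


def histogram(pixels):
    """Count each distinct colour, then sort the entries by colour."""
    counts = Counter(pixels)        # one table entry per distinct colour
    return sorted(counts.items())   # sorting cost grows with the table size


def gradient(width, height):
    """Synthetic image (width, height <= 256) where every pixel has its own colour."""
    return [(x, y, 0) for y in range(height) for x in range(width)]


if __name__ == "__main__":
    few = [(0, 0, 0)] * 4000 + [(252, 0, 0)] * 96   # 4096 pixels, 2 colours
    many = gradient(64, 64)                         # 4096 pixels, 4096 colours
    print(len(histogram(few)), len(histogram(many)))
```

Both inputs have the same number of pixels, but the table to hold, sort and print is 2 entries in one case and 4096 in the other. In this model, an image with only a handful of colours should stay cheap regardless of its size.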

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T21:16:33-07:00
by fmw42
I guess it has been changed. But for speed and less accurate results, the histogram ought to have an option for the number of desired bins (at least options for 256 bins or 1024 bins).

For example:

histogram256:info:

histogram1024:info:

or some other way, such as -set option:histogram:bins

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T21:29:04-07:00
by snibgo
Yes, provided that does colour reduction, rather than simply discarding any colours after the 1024th or whatever.

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-27T21:32:04-07:00
by fmw42
snibgo wrote:Yes, provided that does colour reduction, rather than simply discarding any colours after the 1024th or whatever.

It should just bin the colors into so many bins and report say the center color of the bin.

I just now posted a question to the Developers forum. See viewtopic.php?f=2&t=16300#p59188
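
The binning idea can be sketched as follows. This is a hypothetical illustration of the proposal, not an existing ImageMagick option. It assumes 8-bit RGB pixels and a fixed number of equal-width bins per channel (so 8 per channel gives 512 bins in total), and the name `binned_histogram` is made up.

```python
from collections import Counter


def binned_histogram(pixels, bins_per_channel=8):
    """Group 8-bit RGB pixels into equal-width bins, keyed by each bin's center color."""
    if 256 % bins_per_channel:
        raise ValueError("bins_per_channel must divide 256")
    width = 256 // bins_per_channel
    counts = Counter()
    for pixel in pixels:
        # Snap each channel to the center of the bin it falls into.
        center = tuple((value // width) * width + width // 2 for value in pixel)
        counts[center] += 1
    return dict(counts)


if __name__ == "__main__":
    pixels = [(0, 0, 0), (10, 10, 10), (255, 255, 255)]
    print(binned_histogram(pixels))
```

Every pixel lands in some bin, so nothing is discarded. The trade-off is that nearby colors such as (0,0,0) and (10,10,10) are reported together under one center color.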

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-28T09:37:10-07:00
by bartgrefte
fmw42 wrote:you can get the number of unique colors from IM by using:

convert image -format "%k" info:

you can get a list of unique colors by using

convert image -unique-colors -depth 8 txt:-

This will be faster, I would hope, than using histogram:info:

Please post a link to a sample PNG for us to examine, one that you think runs slowly with histogram:info:. Unless it is very special, you probably have more colors than you think.

By the way, if your image has more than 1024 colors in it, histogram:info: will color reduce your image to 1024 colors.

Samples? Uhm, as far as I know, I am not allowed to put stuff like that online.
However, since a zip with examples is published on my employer's website, I don't think it's a problem if I post a link to it here.
http://www.kadaster.nl/klic/documentati ... agde_1.zip

The first command gives 43 for one of the pictures from that file. Please note that all those pictures were rejected. I had my batch file retrieve all the info we need to check ... those pictures are a mess :P. Unfortunately, there aren't any troublemakers in there.
Well, at least none with RAM usage as high (4 times higher) as I've had with some other pictures.
RAM usage is around 116 MB with each of those samples and one of the CPU cores is used completely, but I expect that to be normal.

Second command: 42.

I think I remember one troublemaker, so I'm going to look it up next week when I'm at work again. If I find it, will the info from those two commands be useful, or do you need more image characteristics? Since the one I know I can find is not a published sample, I'm not allowed to post it. But giving some characteristics isn't a problem as far as I know.

Btw, those commands aren't faster; they take the same time.

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-28T10:13:21-07:00
by fmw42
How big are your files? Can you post a link to an example of one of them?

What is your IM version and platform? How much memory do you have?

Re: Convert sometimes very slow with creating histogram

Posted: 2010-05-28T10:26:48-07:00
by snibgo
bartgrefte wrote:But I've never run into a PNG with more than 5 colors.

LG_gas+hoge+druk_Enexis_0000546670_09G267447.png has 58 colors.

58 is still small, no problem. You may find the troublemakers have many more.

My suggestion, in my first post, will only make a difference for large numbers of colors.

If a troublemaker has a small number of colors, you might:

identify -verbose in.png >in.txt

and post that information here (if you feel able).