
My turn to nitpick!

RGB to grayscale: V = r x g x b/3 (one line).

The Sobel operation is at most 4 nested loops (really only 3), for a total of ~7 lines of code, depending on how you like your whitespace.

Fold the summation into your Sobel function without requiring another line.

The most complicated thing used from the OpenCV library (which I'm well aware is C++!) is JPEG decoding. OK, I'm avoiding swirling, but then it isn't needed to solve the CAPTCHA anyway.



Show me working code and then we can talk. :)

Oh, BTW, "V = r x g x b/3" is incorrect. That is not how you convert a color image to a grayscale image.


Ok, I will. I'll assume that we already have the image in a 1-dimensional array, each element containing an array of the RGB values. Then we can convert it to grayscale with one line of Python.

    gray = [sum(p) // 3 for p in IMG]  # per-pixel average of the three channels
If you want to test it, you can use this 3x3 sample image (or just load one):

    IMG = [[5, 5, 5], [6, 8, 9], [94, 123, 4], [54, 5, 32], [44, 3, 3], [34, 234, 33], [5, 5, 5], [6, 8, 9], [94, 123, 4]]
The result will look like this:

    [5, 7, 73, 30, 16, 100, 5, 7, 73]
If your image happens to be loaded into a 2-dimensional array, use this:

    gray = [[sum(p) // 3 for p in row] for row in IMG]
Sobel is slightly more complicated, but can be written like this (you'll need a larger sample image though):

    import math

    # IMG: 2D array of grayscale values, e.g. the output of the conversion above
    width, height = len(IMG), len(IMG[0])         # width and height of our image
    sobel = [[0] * height for _ in range(width)]  # result image
    # The actual filter starts here:
    for x in range(1, width - 1):
        for y in range(1, height - 1):
            sx = IMG[x-1][y-1] + IMG[x][y-1]*2 + IMG[x+1][y-1] - IMG[x-1][y+1] - IMG[x][y+1]*2 - IMG[x+1][y+1]
            sy = IMG[x+1][y-1] + IMG[x+1][y]*2 + IMG[x+1][y+1] - IMG[x-1][y-1] - IMG[x-1][y]*2 - IMG[x-1][y+1]
            sobel[x][y] = math.sqrt(sx*sx + sy*sy)
The filter itself is still a total of 5 lines. And I have to thank you: I finally learned what convolutions are while doing this.
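
If you want to sanity-check it, one quick (hypothetical) test image is an 8x8 block that is dark on one side and bright on the other:

    IMG = [[0 if x < 4 else 255 for _ in range(8)] for x in range(8)]

Running the loop above on that, the magnitude should spike along the two slices straddling the edge (x = 3 and x = 4) and stay zero everywhere else.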


Excellent! Now we are starting down the right path. I don't have time to fully look at your code right now. I will later.

Just one quick observation. You can't convert from RGB to grayscale by simply averaging the three values. Each color channel influences the perceived luminance (grayscale) differently, with green being, by far, the largest component and blue the least significant. Rather than give you the answer I'd suggest you research "converting RGB to grayscale" or "converting RGB to luminance" as this is an important subject to understand if you are dealing with images.

I'll take a look at the rest of it later and comment.


If you had done any serious work with color, you would know that there are multiple ways of converting to grayscale. I personally know of three: average, lightness, and luminosity. All three are valid ways of converting to grayscale, and every good image manipulation program (GIMP, Photoshop...) will offer you all three. I picked the first one because it's the easiest to understand and the one that almost everyone will be able to come up with.
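
For reference, here's a quick sketch of all three in the same style as above (the luminosity weights below are the common CCIR 601 set; GIMP's luminosity uses slightly different weights):

    def average(p):     # plain mean of the three channels
        return sum(p) // 3

    def lightness(p):   # midpoint of the strongest and weakest channels
        return (max(p) + min(p)) // 2

    def luminosity(p):  # weighted sum: green dominates, blue matters least
        r, g, b = p
        return 0.299 * r + 0.587 * g + 0.114 * b

    gray = [average(p) for p in IMG]  # swap in lightness or luminosity to taste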

> You can't convert from RGB to grayscale by simply averaging the three values

Of course I can.

Edit: I just googled "RGB to grayscale" and this was the first result: http://www.johndcook.com/blog/2009/08/24/algorithms-convert-...

I recommend reading it.


> If you had done any serious work with color, you would know

And that's the end of the conversation, isn't it?

You could have simply modified your program based on my friendly input and done it correctly. How was jabbing me in the eye a better choice? Particularly when you don't even know me. That's unfortunate.

Perhaps you might consider emailing me privately? I'll provide you with references. See my HN profile for the address.

I have only devoted somewhere between 20 and 25 years of my life to, among other things, dealing with accurate color and image processing in both hardware and software. So, yeah, I know a thing or two about the subject.

There's doing it right and there's doing it wrong. Averaging RGB values to derive grayscale is --and I am trying hard not to say what I really want to say-- not the right way to do it.

Part of the context here is to consider the source of the images you might be processing. The very design of every single camera on the market is based on a relationship between these color primaries that is to be maintained across the processing pipeline.

No device I know of will uniformly average the RGB channels as this is simply the wrong way to process and deal with color accurately. You can get away with this kind of thing for very specific applications (if you are computationally limited AND know exactly what you are doing).

Even then you can, as I have done in hardware a few times, massage the coefficients to better reflect reality. One such example is the implementation of a "cheap" motion-detection facility in hardware (an FPGA). In that case floating-point math is not an option (and it wouldn't make sense), so you can either futz with the coefficients or use a set of pre-computed lookup tables to do it accurately.
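
Not the FPGA code itself, obviously, but in Python the trick looks roughly like this; the scaled weights are one common 8-bit approximation of the CCIR 601 coefficients:

    # CCIR 601 weights scaled by 256 and rounded: 77 + 150 + 29 = 256
    def luma_fixed(r, g, b):
        return (77 * r + 150 * g + 29 * b) >> 8

    # Or trade the multiplies for three pre-computed lookup tables:
    LUT_R = [77 * v for v in range(256)]
    LUT_G = [150 * v for v in range(256)]
    LUT_B = [29 * v for v in range(256)]

    def luma_lut(r, g, b):
        return (LUT_R[r] + LUT_G[g] + LUT_B[b]) >> 8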

In some cases you can even ignore red and blue and just use green as reference. Again, just like before, knowing the application and fully understanding what you are doing is critical when making such choices.

In this case you are trying to detect edges in an image that is, more than likely, not artificial. In other words, it might be a photograph. It more than likely is, or at some point passed through, a JPEG image. This means that the image, regardless of source, was converted to the YCbCr color space and then handed back to you as RGB. If you want to accurately work with actual image data, and not some distorted, contrast-altered grayscale image, the only way to do it is to recover the Y component from the RGB source data using the correct mathematical approach.

Really, it ain't that hard:

Y = (0.299 * R) + (0.587 * G) + (0.114 * B)

This corresponds to CCIR 601. Things can get a little confusing, as the primaries were modified slightly for Rec. 709 (another imaging standard). JPEG is defined around CCIR 601 primaries, so the above-noted coefficients are correct for that application.
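
In the one-liner style from earlier in the thread (assuming the same 1-dimensional IMG layout), that's:

    gray = [0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2] for p in IMG]

Wrap the expression in int() or round() if you need integer pixel values.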

To anyone dealing with color professionally these shortcut "solutions" reveal nothing but utter ignorance of the underlying science. I do not intend this as an insult; it's just a fact. Barring a very specific and valid reason for taking such shortcuts, these "solutions" are always a bad idea.

I happen to have a pretty good handle on --among other things-- color science. I am, however, clueless about building rockets. That said, if I wanted to build a rocket you can bet I'd spend a non-trivial amount of time learning as much about the subject as possible before using uninformed shortcut solutions.

Real Color Scientists cringe at this sort of stuff because, in darker times, it made it into all kinds of programs written by color-science-ignorant programmers. These programs caused untold havoc with image processing. Thankfully things are far better now as those doing serious work with images have taken the time to understand and learn about color science.

If you really want to learn to process images properly and accurately, forget that the idea of (R+G+B)/3 ever existed, remove it from your vocabulary, and replace it with the above.

Also, go browse around the Rochester Institute of Technology website. I spent a bit of time there. Color Science is one of their focal points. Lots of good info there, even a number of interesting courses.


Well. My post was intended to provoke an emotional (err angry) response, mainly because I have a personal problem with people being overly nitpicky and sounding extremely arrogant. I don't know if you're aware of it, but that's how you come across. I already knew about your background because you mentioned it in another post.

That aside, I really appreciate your effort to teach me about converting RGB values to grayscale.

On the other hand, I don't think that converting the image to something that is accurate to the human eye is what we want to do here (we're not going to show the result to a human anyway). Using your formula we would end up overly favoring green levels in our edge detection, even though we want to treat all colors equally. Well, unless the next thing you're going to argue about is the significance of different levels of blue to edge detection, and that green in general is the better channel for detecting edges.

Call me mean or just stupid, but for my part I think we both should reflect on how much amateurishness we can tolerate and when to just not reply to something.

Sometimes trying to show someone how ridiculous he comes across by mimicking his behavior leads to people wasting countless hours on a pointless internet argument that, at some point, becomes a fight over something not even really related to the main points the authors were trying to prove.

I would usually have left it at that, but in this case I felt bad because you actually made a great effort explaining all that stuff, and yourself. I thought it might be fair to tell you how I see this.

Still some interesting stuff, but a lot of information. You should get a blog and write some lengthy (in a good way) posts about this. You seem to have quite a lot to tell people about this, and also a desire to.

I'll continue feeling a little bad about this while I'm sleeping.

Good night

chmod

Edit/PS: Yes, I thought about Blade Runner while writing the first sentence.


Ah, Blade Runner. You are one of the good guys.

No, I didn't get angry (and no, I am not a Replicant). It actually saddened me that what seemed like a useful conversation stopped dead-cold with such a personal comment. As I said, I have devoted a lot of my life to image and color processing. I guess it's part of the problem of being somewhat anonymous, something I am moving away from slowly.

Look, it's easy to come off as arrogant over email, newsgroups or similar means of communication. Part of it is that sometimes people take it the wrong way when someone comes out in an authoritative manner. I do. However, I only do that when (a) I really, really know what the hell I am talking about and (b) I don't have the time to write twice as much text to cover all corner cases and be sure that everyone sees me as "nice". I've heard talks by Linus where I thought he came off as an arrogant asshole. Then I slowed down and realized where he was coming from. Once you understand that, it all makes far more sense and, yes, it stops feeling arrogant.

I'm not 16 any more, so I don't really care about seeming "nice" online because, well, it's hard and it takes time. This, for me, isn't a popularity contest. I'm simply, honestly, trying to share something and learn as well. For example, I don't use Python that much at all. Inspired by this thread I sat down and played with Python quite a bit. That's a good outcome, at least for me anyway.

With regards to the idea of favoring green more than red and blue: this isn't the intent of the equations; it's actually what happens in the real world. If you look at the spectral power distribution of a captured image you will see that, generally speaking, there's a lot more energy around the green portion of the spectrum. I am over-simplifying and cutting corners here, but that's one way to think of it.

In other words, in normal images under normal lighting there's far more green stuff than red or blue. And so, in converting an image to a grayscale representation you have to account for the fact that green contributes to the image about twice as much as red and roughly five times as much as the blue component. If you don't apply these weights you are going to be evaluating things such as noise while attributing far more value than is warranted to image structures in the other channels.

Another generalization is that image noise is generally found far more prominently in the blue channel than in the others. If you simply average all three channels you are effectively amplifying the blue channel. Blue should have a weight of about 11% and you are giving it 33%. You have just tripled its importance and, if there's any noise there, you've just multiplied it by three. When it comes to green, you are halving its contribution, from about 60% down to 33%. Here's the component that generally contributes the most information to an image and, by averaging it with the other colors, its contribution is cut nearly in half. Finally, red is the component that suffers the least (almost not at all) from averaging: red contributes about 30% to an image, and averaging bumps it to 33%.
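
A tiny worked example with made-up pixel values shows the effect. Take a dark pixel whose blue channel is mostly sensor noise:

    r, g, b = 10, 12, 40                    # hypothetical noisy-blue pixel
    avg  = (r + g + b) / 3                  # ~20.7: the noisy blue dominates
    luma = 0.299*r + 0.587*g + 0.114*b      # ~14.6: blue damped to ~11% weight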

With regards to a blog. Actually, I've been thinking about it. Maybe later this year. A blog feels far more "serious" than posting in places like HN.

Don't feel bad either. Life is too short to get worked up about stuff that, in the grand scheme of things, matters not at all.


The problem is that you've entirely missed the point. The code solves the problem. We don't care about recreating grayscale to match human perception, or whatever; we care about solving the problem placed in front of us.


Sorry, it was late!



