Public Lab Research note


Noise by colour

by viechdokter | | 1,014 views | 23 comments |

Read more: publiclab.org/n/12994


This is a revised note! There was a small error in my primary colour noise curves: the red curve was far too high, because an undetected uncalibrated spectrum was left in when I gathered the CSV files together. I stumbled across the fact that the noise in the red channel was 140, which could not be right, so I went through the data again and found the error. The revised red noise curve should now be correct.

The question: Camera noise seems to be an issue when assessing the light intensities of specimens (although I think wavelength peaks are more important when trying to get "fingerprint evidence"). What I wondered was: which colour channel has the most noise, and at which wavelength?

The data: I took 46 spectra of a Philips 5.5 watt, 350 lumen, 2700 Kelvin energy-saving LED lamp at 5-minute intervals. (Thanks again to @Warren for his macro!) It has a nice white light with a pretty smooth average curve. I then looked at the CSV files and extracted the values for

1) average 2) red 3) green 4) blue channels

per wavelength. Then, for each wavelength, I took the maximum and minimum of the average, red, green and blue values and calculated the difference (for instance, highest blue value minus lowest blue value at a given wavelength). This should serve as a measure of the noise.
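The max-minus-min procedure described above can be sketched in Python. The CSV column order (wavelength, average, red, green, blue per row) and the function name are my assumptions for illustration, not taken from this note or from the Spectral Workbench source:

```python
import csv
from collections import defaultdict

def noise_ranges(csv_paths):
    """Per-wavelength (max - min) for each channel across many spectra.

    Assumes each CSV row is: wavelength, average, r, g, b
    (a guess at the Spectral Workbench export layout).
    """
    values = defaultdict(list)  # wavelength -> list of (avg, r, g, b)
    for path in csv_paths:
        with open(path, newline="") as f:
            for row in csv.reader(f):
                wl, avg, r, g, b = (float(x) for x in row[:5])
                values[wl].append((avg, r, g, b))
    ranges = {}
    for wl, samples in values.items():
        channels = list(zip(*samples))  # transpose to per-channel tuples
        ranges[wl] = tuple(max(c) - min(c) for c in channels)
    return ranges
```

Feeding all 46 CSV files into a function like this yields one "noise" number per channel per wavelength, which is what the curves below plot.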

The results: Here are the "noise curves" per channel:

Average noise curve:

noise_of_average_curve_Philips_LED_reproducibility_test_01-46.jpg

Revised red noise curve:

red_curve_noise.jpg

Green noise curve:

noise_of_green_curve_Philips_LED_reproducibility_test_01-46.jpg

Blue noise curve:

noise_of_blue_curve_Philips_LED_reproducibility_test_01-46.jpg

Thoughts: When I look at any of the spectra I have captured for this test I see that there are three peaks: a red, a blue and a green one. Not surprising.

spectrum_and_curve_Philips_LED_reproducibility_test_44.jpg

The red intensity peak is almost as high as the blue peak, while the green one is lower. So what about the noise? Shouldn't the red noise be about as strong as the blue noise?

As expected, the green noise reaches only about 26 intensity points, whereas the blue noise gets up to 46 and the red noise reaches 52, which is close to the blue value.

The noise curve peaks:

  • the average noise curve peaks at about 430 nm, 565 nm and 572 nm

  • the highest red noise peak is at around 420 nm

  • the highest green noise peak is at about 577 nm, although the green noise curve is much more "compact" than those of the other colours.

  • the highest blue noise peak is at about 565 nm

Things I wonder about: One might think that noise is highest where there is less of a colour (in dark regions, where the webcam's Automatic Gain Control works hardest). Yet we find the most red noise in the "blue" area, while we see the most blue noise in "non-blue" areas.

Next step: I will take the wavelengths with the peak noise and look at how the actual red/green/blue/average values change over time.
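That next step can be sketched as follows, again assuming a CSV layout of wavelength, average, red, green, blue per row (my guess at the Spectral Workbench export; the function name is hypothetical): pick the row nearest the noisy wavelength in each spectrum and collect one channel in capture order.

```python
import csv

def channel_over_time(csv_paths, target_wl, channel):
    """Track one channel's value at (closest to) target_wl across
    spectra taken in sequence. channel: 1=average, 2=red, 3=green,
    4=blue (assumed column order)."""
    series = []
    for path in csv_paths:  # csv_paths given in capture order
        with open(path, newline="") as f:
            rows = [[float(x) for x in row[:5]] for row in csv.reader(f)]
        nearest = min(rows, key=lambda r: abs(r[0] - target_wl))
        series.append(nearest[channel])
    return series
```

Plotting the returned list against capture time would show how a single wavelength's intensity drifts over the 46 spectra.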






23 Comments

Good observations! Remember that the AGC is likely based on the sum total over the entire image, not per pixel. If there were independent AGC per colour or per pixel, the image would be very messed up most of the time.

The green channel likely has inherently lower noise because the Bayer filter in the camera is RGGB: there are double the number of sensors for G, so the camera likely averages out some of the noise in that channel.
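That averaging effect is easy to check numerically: the mean of two independent, equally noisy samples has its standard deviation reduced by a factor of sqrt(2) (about 0.71x). A small plain-Python simulation (the function name is mine, just for illustration):

```python
import random
from statistics import pstdev

def bayer_green_noise_demo(n=200000, seed=1):
    """Compare the noise of a single sensel with the mean of two
    independent sensels, as demosaicing can do for the two G sites
    in an RGGB block. Returns (single_std, paired_std)."""
    rng = random.Random(seed)
    single = [rng.gauss(0.0, 1.0) for _ in range(n)]
    paired = [(rng.gauss(0.0, 1.0) + rng.gauss(0.0, 1.0)) / 2.0
              for _ in range(n)]
    return pstdev(single), pstdev(paired)
```

The paired standard deviation comes out near 1/sqrt(2) of the single-sensel value, consistent with green looking "quieter" in the noise curves above.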

Red is in the longer wavelengths, where thermal noise may increase as an additive component (relative to the other channels' noise sources). There might also be more thermal noise at that end from the lamp, despite it being LED-based -- but this is really just an additional question.



@stoft : RGGB like in this picture?

RGGB.jpg



Right. That's the Bayer filter pattern which sits in front of the imager chip and gives double the signal sensitivity for green -- a simulation of the human eye's sensitivity. The camera then runs a demosaicing algorithm to estimate RGB per pixel, since each pixel sensor only detects one colour.


@stoft : sorry, I had to revise the research note as there was a big mistake in the red curve. I replaced it with the new revised red curve, and now things look different: the red noise peaks mostly in the blue area, whereas the blue noise peaks mostly in the red area of the spectrum. Sorry about the mistake. Hope one day you can forgive me...

If you want to have a look at the raw data I attach the files:

average_curve_noise_01-46.ods

red_curve_noise_01-46.ods

green_curve_noise_01-46.ods

blue_curve_noise_01-46.ods


Not to worry, I too find bits to revise after I publish; what counts is fixing the details as they appear. One thought on visualization: when I compare common factors between different tests, I find it easier to visualize the differences by plotting them on the same graph, so the Y scale is the same. It can be less helpful when the absolute values are quite different, but often, with things like %noise, %error, etc., RGB colour curves on the same plot are easier on the eyes ;-)


Yes, you are right. I, too, prefer comparable curves. I used LibreOffice, a free, open-source "Excel". It plots diagrams automatically and chooses the height that best fits each curve. To get the same Y scale I would have to combine the curves in one diagram, I guess. Or do you know of any program that does statistics and visualisations better (automatic feed-in of CSV files included)?



I believe both Excel and Quattro Pro provide for multiple plots. I suspect yours does as well -- in Quattro it's under the specialty XY plotting. I'd be surprised if any spreadsheet didn't offer to plot multiple columns of data against a single column for X. A Google search indicates you just need to use LibreOffice's XY plots.


Yeah, I think I could do it. Hmh, I must admit that I only pretty recently started to use Excel-like programs at all. To be honest, your research notes impressed me and inspired me to find out more about nature in those CSV file numbers. Hey, I am learning more about webcams lately than I ever thought I could. A webcam was just some "light sensor square" to me until I started to think about these things and talk about them with you and others here. The first layer of knowledge is the patterns I can derive via spectroscopy and some "crude" statistics. And I feel there is an even deeper layer of knowledge worth exploring ("classical" electron particle level), and behind that another (quantum level)... Why is there noise at all? How is it generated in detail? What do the electrons want to tell us...? ;-) Isn't it great that we can research such things at home and talk about them with fellow researchers over the internet? Aristotle and Democritus would turn green with envy...



Would he be low-noise green? :-)

I'd also love to see them plotted on the same graph, maybe with the graph lines themselves in appropriate colors. Great note! I'm adding relevant tags too.



@warren:They would be low-noise green if you average them. ;-)

@stoft: BTW, I noticed that the curves averaged over two or more channels look more "rectanglish" ("average" and "green"), whereas the single non-averaged curves ("red" and "blue") look more "rounded" -- so this, too, suggests you are right that two green channels were averaged here.

average_versus_red_detail.jpg


Hmm, very interesting -- without a vertical scale it's hard to say, though. I believe the "plateaus" on the "average" curve are artifacts of the precision of the intensity data -- the data can't record values between two steps, so it jumps to the nearest intensity above or below.

For example, if the precision were to the tenth of a value and the real value were 0.15, we'd only see either 0.1 or 0.2. Make sense?
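A minimal sketch of that rounding effect: snapping a smooth ramp of values to a fixed step produces exactly the flat "plateaus" discussed above. The function name is mine, just for illustration, and floating-point rounding decides which neighbour a midpoint like 0.15 lands on.

```python
def quantize(x, step=0.1):
    """Snap x to the nearest multiple of step."""
    return round(x / step) * step

# a smooth ramp turns into flat "plateaus" once quantized:
ramp = [i * 0.03 for i in range(10)]
plateaus = [quantize(v) for v in ramp]
```

Adjacent ramp values that fall inside the same step collapse to the same quantized value, which is what a staircase-looking "average" curve would show.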



Yes, it's called quantization; the camera only has 8 bits, and the scale of the two plots is likely about 3-5x different. Take a look at the 'red' plot and the vertical 'size' of its smallest increment, then look at the 'averaged' smallest increment. If this is not the case, then I'd suspect the math used in the averaging: you want int16( (double(r)+double(g)+double(b))/3.0 ) or the equivalent, and then plot on the same plot with the same scale.
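A Python sketch of the same point: keep full precision through the sum and convert to an integer only once at the end, rather than truncating each channel first. The function names are illustrative, not from any library.

```python
def avg_full_precision(r, g, b):
    """Average three 8-bit channel values in floating point, rounding
    to an integer only at the end -- the Python analogue of
    int16((double(r)+double(g)+double(b))/3.0)."""
    return int(round((r + g + b) / 3.0))

def avg_lossy(r, g, b):
    """A common mistake: truncating each channel before summing
    throws away up to 2 counts per channel."""
    return r // 3 + g // 3 + b // 3
```

For r = g = b = 5 the full-precision version returns 5, while the lossy version returns 3, which is the kind of averaging error that would distort a noise plot.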


Talking about bits and quantization: I also noticed that certain values (2.55 and multiples thereof like 5.1, 7.65 ...) appear in every spectrum's CSV file. So 2.55 seems to be the "Planck quantum" here.

quantization.jpg


Okay, here I combined the 4 channel max/min-difference curves in one diagram so they have the same vertical scale now:

all_curves_in_one_diagram_Philips_LED_reproducibility_test_01-46.jpg

And here the combined data for you to check or play around with:

all_curve_noise_01-46-1.ods

all_curve_noise_01-46-1.xml


Well, except there's still a problem with the plot. The smallest increment looks to be '4 steps' per scale unit of '10'. Either there was some multiplier involved in plotting or processing, or there's some odd error.

As for the previous data with '2.55', I don't know how that relates to the source data -- which is 8-bit with the PLab camera, so it can only be an integer from 0 to 255. Perhaps your source is something else?



I took the data from the CSV files that Spectral Workbench provides with my spectra. I had a look at one of dhaffner's spectrum CSV files and see the same "Planck quanta" there:

quanta_dhaffner.jpg

As for the plot: I enlarged it so it would basically fit my screen, and after that I put screenshots together in Photoshop, so the x- and y-directions probably don't have the same enlargement factors. The actual curve data came from the CSV files I got with my spectra from the Workbench. As I put them into the same diagram, their heights should have the same x- and y-scale factors. I included the CSV data and the original curves in ODS and XML format above if you want to have a look.


Hmmm, well, the RAW data should have the same scale units, which would be unsigned 8-bit integers from 0 to 255 -- so the Y axis should show noise with its smallest vertical increment as a unit of '1'. For a Y-axis increment of 10, you should be able to see 10 steps of the smallest quantization of noise.

As for the CSV files from SWB: if it's raw data, it should not have interpolated values. But, with that said, SWB does do some "scaling", so unfortunately it can be very unclear whether the CSV data someone exports has been scaled. (They also scale the X axis after CFL calibration, which is the source of the "higher-precision" wavelength data and can be misleading as well.) Assuming the '2.55' values are from a scaled plot, they make sense -- i.e. 2.55 is the scale factor that represents an 8-bit '1' value. (This does look to be the case, as the next value up should be an 8-bit '2', which is 5.1 scaled... and that does appear in the list.) So, if you're dealing with SWB-scaled data, you could 're-scale' that data back to 'raw' and then plot -- the Y axis should show 1 unit per minimal noise quantization value of 1.
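A sketch of that re-scaling, following stoft's reading that one 8-bit count appears in the CSV as 2.55. This is an assumption based on the observed 2.55 / 5.1 / 7.65 steps, not confirmed from the SWB source, and the function name is hypothetical:

```python
def to_raw_counts(scaled):
    """Undo the assumed Spectral Workbench scaling, where one raw
    8-bit camera count shows up in the CSV as 2.55."""
    return int(round(scaled / 2.55))
```

Applied to a whole CSV column, this would turn the 2.55-step values back into the integer 0-255 counts the camera actually produced.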


I could re-scale, but... when everybody here gets the same minimum 2.55 increments, and you want reproducible spectral data, then everyone should work with the same data they get from the Workbench, don't you think?

What about your own spectra? Do you use a different capture software?



Hi, all -- 2.55 is definitely related to the 0-255 value. The code that generates the CSV is here:

https://github.com/publiclab/spectral-workbench/blob/675d11793927fa2507eebd6c075d3304711842b7/app/helpers/spectrums_helper.rb

The original data the CSV is based upon is encoded here: https://github.com/publiclab/spectral-workbench.js/blob/ef52eac98c31066bf438fd05c09b5241e64487ae/src/SpectralWorkbench.Spectrum.js#L119-L144

If it's an error, we should definitely fix it! There's also potential for a simpler implementation in the new spectral-workbench Node.js implementation. I recently coded a CSV decoder here:

https://github.com/publiclab/spectral-workbench.js/blob/ef52eac98c31066bf438fd05c09b5241e64487ae/src/SpectralWorkbench.Spectrum.js#L29-L39

And opened an issue for a new CSV encoder here: https://github.com/publiclab/spectral-workbench.js/issues/12

There seem to be two places to fix it if this is indeed wrong:

  1. ensure output is max 255 instead of max 2.55 by multiplying by 100
  2. displaying/storing all data in 0-255 instead of percentages (we'd have to be sure this doesn't affect legacy data/operations, and doesn't conflict with other math we do)

The Node.js package is very thoroughly tested, so if we did introduce such a change, it'd be pretty easy to see if it breaks anything.


It's OK to have a 'scaled' data file -- but ONLY if that file is clearly identified as such, like /SpectrumData-SCALED_20160420.csv or /SpectrumDataCALD-SCALED_20160420.csv. Otherwise, yes, all CSV data files should be in native resolution (8-bit) for now. @Warren and I have had discussions about this general topic, but I think the results have yet to be implemented in SWB.

I have difficulty with SWB (the UI, latency, hung scripts, etc.), so I don't bother with it and use Matlab instead. I can read the USB camera directly at many frames per second and then manipulate and plot easily. None of that is available with SWB.


Talking about hung scripts: it is often a pain in the neck to log in to SWB. "Do you trust this site with your identity?" "Yes" "You must login to..." and so on. It often takes three or more attempts to log in.



I went ahead and created a specific issue for native 8-bit storage: https://github.com/publiclab/spectral-workbench.js/issues/18

I also outlined the exact lines that'd need to be modified, including the tests (which currently use scaled data).

Besides rejiggering the tests, the legacy data issue is a big one, but we could let people "turn this on" optionally for legacy data, and make it the default for future data.


@warren: I just had a look at an old spectrum CSV file of yours (3 years ago):

https://spectralworkbench.org/spectrums/481

It contains unscaled data, i.e. integer values, but interestingly I saw some "257" values in there.

