Machine learning - but learning from what?

10 February 2022

Our goal is to make the computer learn how to denoise images. But how can it learn, and from what? And what is noise, anyway?

Image: creeping noise.

A very technical explanation claims noise is something that makes your picture look bad. In pictures, it mostly occurs through two mechanisms:

Poverty. That’s why a picture shot with a top-notch camera (~ 7,500€) at ISO 102,400 still looks usable, while another shot with an entry-level one (~ 350€) at ISO 6,400 looks like a bunch of pixels whose values were picked by rolling dice.
Quantum mechanics. Light is actually made of tiny individual particles (that are also waves, that are also particles, and so on), called photons. At the end of their journey(s), photons ultimately crash into the sensor, generating an electric current.
When the light is low, when the exposure time is short, or both, there are fewer photons smashing into the sensor. This results in random fluctuations in each pixel reading due to the uneven arrival of photons from the source, possibly made worse by the sensor’s attempts to boost this feeble signal by forcing more current into its circuits.
Lastly, this analog signal needs to be converted into digital form (another source of noise).
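
To make the photon argument a bit more concrete, here is a minimal sketch that simulates photon arrivals on a single pixel. It assumes the arrivals follow a Poisson distribution (the usual model for this kind of shot noise) and ignores amplification and analog-to-digital conversion entirely. The function names and the photon counts are made up for illustration.

```python
import math
import random
import statistics


def sample_photon_count(mean_photons, rng):
    """Draw one Poisson-distributed photon count (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean_photons)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def measured_snr(mean_photons, n_pixels=5000, seed=0):
    """Signal-to-noise ratio (mean / standard deviation) over many simulated pixels."""
    rng = random.Random(seed)
    counts = [sample_photon_count(mean_photons, rng) for _ in range(n_pixels)]
    return statistics.mean(counts) / statistics.stdev(counts)


if __name__ == "__main__":
    # Hypothetical photon budgets: a dim exposure and a bright one.
    for mean_photons in (4, 100):
        print(mean_photons, round(measured_snr(mean_photons), 2), round(math.sqrt(mean_photons), 2))
```

Under these assumptions the measured ratio should land close to the square root of the average photon count, so collecting 25 times more light only makes the pixel about 5 times cleaner.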

What can we do when quantum mechanics, thermodynamics, bad luck and other major forces of Nature conspire against us? We fight back.

Just like you can imagine how a noisy picture would look if it weren’t noisy, because you know from experience what a clean picture looks like, the computer can learn in a similar fashion. Problem is, your computer never went on holiday, took pictures and spent time reviewing them while eating pizza.

So that’s what we’re going to do: we will build a dataset made of paired pictures: one clean, one noisy. We will make a lot of them and will then feed them to the computer. Settings:

clean picture: ISO 200
noisy picture: ISO 1,600 pushed + 2 EV (ISO 6,400 equivalent)
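
As a rough idea of what "paired pictures" could look like in code, here is a minimal sketch that groups files into clean/noisy pairs. The naming scheme (`<scene>_clean.<ext>` and `<scene>_noisy.<ext>`), the `ImagePair` class and the `pair_by_scene` function are all hypothetical; the actual dataset may be organised differently.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass(frozen=True)
class ImagePair:
    scene: str
    clean: str  # the ISO 200 shot
    noisy: str  # the ISO 1,600 shot pushed +2 EV


def pair_by_scene(filenames):
    """Group '<scene>_clean.<ext>' and '<scene>_noisy.<ext>' files into pairs.

    The naming scheme is an assumption made for this example only.
    """
    clean, noisy = {}, {}
    for name in filenames:
        scene, _, kind = PurePath(name).stem.rpartition("_")
        if kind == "clean":
            clean[scene] = name
        elif kind == "noisy":
            noisy[scene] = name
    # Keep only scenes that have both versions.
    return [ImagePair(s, clean[s], noisy[s]) for s in sorted(clean) if s in noisy]
```

Scenes missing one of the two versions are simply skipped, since a lone picture gives the computer nothing to compare against.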

What’s ISO anyway? It’s a value related to the sensitivity of your sensor: the lower the value, the lower the sensitivity (low ISO == low sensitivity and vice versa).

If you can shoot at low ISO, this means light is plentiful and/or you can afford longer exposure times. This results in a clean image. If you cannot, it means that light is low and/or you must shoot very fast to freeze a moment (such as moving water), and you need to capture more information in less time. This results in noisy images.
ISO values double when the light required to take a picture halves. This is convenient, as ISO 100 and ISO 200 are spaced just like ISO 800 and ISO 1,600 are: one stop (1 EV) apart.

Practical example: at the same aperture, a picture taken at ISO 100, 1/125 second or at ISO 800, 1/1000 second will be equally bright (with the exception of noise).
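
The arithmetic behind this trade-off fits in a few lines. This is a minimal sketch that assumes the aperture and the scene lighting stay the same; the function names are made up for illustration.

```python
import math
from fractions import Fraction


def stops_between(iso_a, iso_b):
    """Number of stops (EV) separating two ISO values; each doubling is one stop."""
    return math.log2(iso_b / iso_a)


def equivalent_shutter(iso_from, shutter_from, iso_to):
    """Shutter time giving the same brightness at a different ISO.

    Multiplying the ISO by some factor divides the required exposure time
    by that same factor.
    """
    return Fraction(shutter_from) * Fraction(iso_from, iso_to)


if __name__ == "__main__":
    print(stops_between(100, 200), stops_between(800, 1600))
    print(equivalent_shutter(200, Fraction(1, 100), 1600))
```

For instance, going from ISO 200 to ISO 1,600 is three stops, so a 1/100 second exposure can be shortened to 1/800 second.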

I will be using my Canon EOS 1200D for the making of the dataset. Reasons:

That’s the camera that I have.
It produces fantastically noisy pictures, which is good for this project (and, incidentally, the reason for it)
On top of abysmal high-ISO capabilities, its ISO invariance is also horrible

Most datasets out there are built by artificially adding Gaussian noise to the images. Our main goal is to produce a dataset containing real noise, so we will take two pictures of the same thing. As mentioned above, one will be clean (ISO 200). The other one will be shot at ISO 1,600 (a value my camera already begins to struggle with), and artificially boosted to ISO 6,400 equivalent by raising the exposure of the picture by 2 EV. Because of the poor ISO invariance, this results in an even noisier image than one actually shot at ISO 6,400. Example:

This will likely cover real-world use cases, where I will need to push ISO 1,600 images upwards. You will find the datasets here.
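
The +2 EV push described above boils down to multiplying pixel values by 4. Here is a minimal sketch of that step, assuming linear pixel values normalised between 0 and 1. The function names are hypothetical, and this is not necessarily how the pictures in the dataset were processed.

```python
def equivalent_iso(base_iso, push_ev):
    """ISO a shot is equivalent to after being pushed by push_ev stops."""
    return base_iso * 2 ** push_ev


def push_exposure(pixels, push_ev, white_level=1.0):
    """Brighten linear pixel values by push_ev stops, clipping at the white level."""
    gain = 2.0 ** push_ev
    return [min(value * gain, white_level) for value in pixels]


if __name__ == "__main__":
    # Three made-up readings of the same grey patch, scattered by noise.
    readings = [0.08, 0.10, 0.12]
    print(equivalent_iso(1600, 2))
    print(push_exposure(readings, 2))
```

The push multiplies the scatter between readings by the same factor as the signal itself, so whatever noise was already in the ISO 1,600 shot comes along for the ride.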

Coming up: how we’re gonna digest all those images to feed them to the computer for learning.