Thresholding
Global Thresholding
A threshold defines a cut-off value, which is exactly how Thresholding works in Computer Vision. We convert the image to grayscale, then pick a limit between 0 and 255, let's say 128. Any value below 128 is set to 0 and any value from 128 up is set to 255, effectively reducing the colour info of an image to a binary scale of black/white.
This has several use-cases. For instance, OCR and the digitising of scanned hand-written text rely heavily on this method to accentuate the pen strokes and to keep things like shaky strokes or the texture of the paper from getting onto the scanned copy. Similarly, it is useful in recognising objects, since it essentially gives us an imprint of the objects present in an image. Let's see how this is implemented.
Implementation
Global thresholding is just a function which takes in an image and a threshold and normalises the pixels to be either white or black based on that.
Global thresholding is implemented at src/thresholding/global.c
We first convert the image to grayscale. This is crucial, since we need to reduce the colour info to a single "scale" rather than RGB values.
We loop through each pixel.
If the pixel_value < threshold, then we set it to GRAYSCALE_WHITE (255).
If the pixel_value > threshold, then we set it to GRAYSCALE_BLACK (0).
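The loop described here can be sketched in a few lines of C. This is only an illustration, not the contents of src/thresholding/global.c: the function name global_threshold, its signature, and the flat 8-bit grayscale buffer are assumptions, and only the two constant names come from the text. The sketch uses the polarity from the introduction (values below the threshold become black); swap the two constants if the opposite mapping is wanted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRAYSCALE_BLACK 0
#define GRAYSCALE_WHITE 255

/* Hypothetical helper: binarise an 8-bit grayscale buffer in place.
 * Assumed polarity: values below the threshold become black,
 * everything else (including values equal to it) becomes white. */
void global_threshold(uint8_t *pixels, size_t count, uint8_t threshold)
{
    assert(pixels != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        pixels[i] = (pixels[i] < threshold) ? GRAYSCALE_BLACK
                                            : GRAYSCALE_WHITE;
    }
}
```

Working on a flat buffer keeps the sketch independent of any image library; a width * height grayscale image can be passed as a single run of count pixels.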
Result
Problem
Manually setting a threshold involves a lot of guesswork: a threshold that works for one image might not work for another. This is because several variables (e.g. shadows, contrast and lighting) make each image different. A value of 128 is the mid-point, but it might not work for lighter or darker images.
Otsu's Method
Since each image is different, Nobuyuki Otsu came up with a method to compute the threshold for each image from a histogram of its pixel values, each in the range 0-255.
Otsu's method works by calculating the between-class variance for each candidate threshold, i.e. each of the 256 histogram bars. The key is to find the bar with the maximum between-class variance; the label (index) of that bar, not its height, will be our threshold.
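For reference, the quantity being maximised can be written out as follows. The symbols are notation chosen here for illustration, not names from the project: p_i is the fraction of pixels with value i, \omega(t) is the share of pixels at or below the candidate threshold t, \mu(t) is the corresponding partial mean, and \mu_T is the mean of the whole image.

```latex
\omega(t) = \sum_{i=0}^{t} p_i, \qquad
\mu(t) = \sum_{i=0}^{t} i \, p_i, \qquad
\mu_T = \sum_{i=0}^{255} i \, p_i

\sigma_B^2(t) = \frac{\bigl[\mu_T \, \omega(t) - \mu(t)\bigr]^2}{\omega(t)\,\bigl[1 - \omega(t)\bigr]},
\qquad
t^{*} = \arg\max_{0 \le t \le 255} \sigma_B^2(t)
```

Candidates where \omega(t) is 0 or 1 put every pixel in a single class, so the expression is undefined there and those values of t are skipped.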
Implementation
The Otsu algorithm is implemented at src/thresholding/otsu.c
Let's break down what is going on here.
To begin with we compute a histogram, which is an array where each index corresponds to a value between 0 and 255, and at that index is the number of pixels of that value within the image.
We then need to normalise this histogram so that each entry is a fraction between 0 and 1, by dividing each count by the total number of pixels.
We then loop once across each of the "bars" within the histogram, and for each we do the following:
We add to the cumulativeMean, which is the running sum of each bar's index multiplied by its normalised histogram value, up to the current iteration.
We add to the cumulativeSum, which similarly is the running sum of all normalised histogram values up to the current iteration.
We then calculate the between-class variance for the current bar from these two running values.
To find the threshold we need the maximum variance, so we keep the largest variance seen so far in maxVariance; if the current variance is higher, we update maxVariance and set the threshold to the current index.
Finally, we use the previous Global Threshold function to apply the resulting threshold to the image.
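The steps above can be sketched as a single C function. This is a minimal illustration under stated assumptions, not the contents of src/thresholding/otsu.c: the function name otsu_threshold, its signature, the LEVELS constant and the totalMean variable are hypothetical, while cumulativeSum, cumulativeMean and maxVariance follow the names used in the text. The variance is the standard between-class form, (totalMean * cumulativeSum - cumulativeMean)^2 / (cumulativeSum * (1 - cumulativeSum)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEVELS 256 /* number of possible 8-bit grayscale values */

/* Hypothetical sketch: compute an Otsu threshold for an 8-bit
 * grayscale buffer. Returns 0 if every pixel has the same value. */
uint8_t otsu_threshold(const uint8_t *pixels, size_t count)
{
    assert(pixels != NULL && count > 0);

    /* 1. Histogram: how many pixels have each value. */
    size_t histogram[LEVELS] = {0};
    for (size_t i = 0; i < count; i++) {
        histogram[pixels[i]]++;
    }

    /* 2. Normalise to fractions and compute the mean of the whole image. */
    double normalised[LEVELS];
    double totalMean = 0.0;
    for (int i = 0; i < LEVELS; i++) {
        normalised[i] = (double)histogram[i] / (double)count;
        totalMean += i * normalised[i];
    }

    /* 3. One pass over the bars, tracking the running sums. */
    double cumulativeSum = 0.0;
    double cumulativeMean = 0.0;
    double maxVariance = 0.0;
    uint8_t threshold = 0;

    for (int t = 0; t < LEVELS; t++) {
        cumulativeSum += normalised[t];
        cumulativeMean += t * normalised[t];

        /* Skip candidates that leave one class empty. */
        double denominator = cumulativeSum * (1.0 - cumulativeSum);
        if (denominator <= 0.0) {
            continue;
        }

        double diff = totalMean * cumulativeSum - cumulativeMean;
        double variance = (diff * diff) / denominator;

        /* Keep the first bar that reaches the highest variance. */
        if (variance > maxVariance) {
            maxVariance = variance;
            threshold = (uint8_t)t;
        }
    }

    return threshold;
}
```

The returned value can then be handed to whatever global thresholding routine is in use. Because the comparison is strict, ties resolve to the lowest index; other implementations may break ties differently.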
Result
[INFO] Applying optimal threshold of 116
Because the threshold is calculated from the image itself instead of being chosen arbitrarily, the output is generally more appealing.