4. Image Processing Flashcards
(37 cards)
What is image processing / filtering? Why is it done?
It is converting one image into a different one according to some algorithm, e.g. to:
- Reduce noise (smartphone images always contain noise)
- Fill in missing information (e.g. in a Bayer grid using demosaicing)
- Extract image features (edges, corners)
What are images mathematically?
An image is a function I: [a, b] × [c, d] → [0, m]
where [a, b] and [c, d] are the spatial extents of the image (start/end of rows/columns) and m is the maximum intensity value (e.g. 255 in 8-bit grayscale).
For color images the range is [0, m]^3, one value per channel (r, g, b).
What are the properties of linear filters?
- homogeneity: scaling the image and then applying the filter gives the same result as applying the filter first and then scaling: F(a·I) = a·F(I)
- additivity: applying the filter to the sum of two images is the same as applying it to each image separately and summing the results: F(I1 + I2) = F(I1) + F(I2)
- superposition: the combination of the two: F(a·I1 + b·I2) = a·F(I1) + b·F(I2)
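A quick numerical check of these properties; this is just a sketch, assuming a 3x3 box filter and scipy.signal.convolve2d, with arbitrary random test images:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
I1 = rng.random((32, 32))          # two arbitrary test "images"
I2 = rng.random((32, 32))
K = np.full((3, 3), 1 / 9)         # 3x3 box filter as an example linear filter

F = lambda img: convolve2d(img, K, mode="same")

print(np.allclose(F(2.5 * I1), 2.5 * F(I1)))                    # homogeneity
print(np.allclose(F(I1 + I2), F(I1) + F(I2)))                   # additivity
print(np.allclose(F(2 * I1 + 3 * I2), 2 * F(I1) + 3 * F(I2)))   # superposition
```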
How can linear filters be done using vector-matrix operations?
We flatten the image matrix into a single vector of all pixel values (e.g. 1×2,000,000 for a 2-megapixel image) and multiply it with a 2,000,000×2,000,000 matrix.
This is very expensive for a simple linear filter.
Also, a general matrix does not guarantee shift-invariance: shifting the image by a few pixels and then filtering can give a different result than filtering and then shifting.
Convolution is better.
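A minimal 1D sketch of the matrix-vector view: a small convolution equals multiplication with a banded (Toeplitz) matrix whose rows hold the flipped kernel, shifted by one position. The signal and kernel values here are made up for illustration:

```python
import numpy as np

signal = np.array([1.0, 4.0, 2.0, 5.0, 3.0])
kernel = np.array([0.5, 0.3, 0.2])     # asymmetric, so the flip matters

# Build the equivalent convolution matrix: each row contains the flipped
# kernel, shifted one position to the right relative to the row above.
n = len(signal)
M = np.zeros((n, n))
for i in range(n):
    for j, k in enumerate(np.flip(kernel)):
        col = i + j - 1                # -1 centers the 3-tap kernel on pixel i
        if 0 <= col < n:
            M[i, col] = k

print(np.allclose(M @ signal, np.convolve(signal, kernel, mode="same")))  # True
```

For a real image the matrix would be enormous (hence the cost argument above), while the sliding-window convolution only ever touches the few kernel weights.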
What is convolution? How is it performed?
It is a way to apply a linear filter to an image.
We have a filter (kernel). We slide a window over the whole image and multiply each image value with the OPPOSITE (mirrored) kernel value, i.e. the kernel is flipped before the multiplication.
Written f * g: (f * g)(x) = Σ_t f(t) · g(x − t)
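A minimal direct implementation of this sliding-window-with-flipped-kernel idea, checked against scipy.signal.convolve2d; the test image and kernel are arbitrary:

```python
import numpy as np
from scipy.signal import convolve2d

def conv2d_valid(image, kernel):
    """Direct 2D convolution: flip the kernel, then slide it over the image."""
    kf = np.flip(kernel)               # mirror in both axes (the "opposite" values)
    kh, kw = kf.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kf)
    return out

rng = np.random.default_rng(1)
img = rng.random((8, 8))
k = rng.random((3, 3))
print(np.allclose(conv2d_valid(img, k), convolve2d(img, k, mode="valid")))  # True
```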
What are the properties of convolution?
- can be represented as matrix-vector product
- linear
- associative
- commutative
- shift-invariant: shifting the image and then convolving gives the same result as convolving and then shifting
What is correlation?
It is similar to convolution, but the kernel is not mirrored: we multiply the pixels with the kernel value at the EXACT (same) position.
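A small check of the relationship, assuming scipy's convolve2d/correlate2d and a Sobel-like kernel as an example: convolution equals correlation with the mirrored kernel, and differs from plain correlation whenever the kernel is asymmetric:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(2)
img = rng.random((8, 8))
k = np.array([[1., 0., -1.],          # Sobel-like kernel: not symmetric
              [2., 0., -2.],          # under mirroring, so the flip matters
              [1., 0., -1.]])

conv = convolve2d(img, k, mode="valid")
print(np.allclose(conv, correlate2d(img, np.flip(k), mode="valid")))  # True
print(np.allclose(conv, correlate2d(img, k, mode="valid")))           # False
```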
What are the properties of noise-pixel? What could be the cause of those noise-pixels?
A noise pixel is an outlier relative to its neighbours (it has much lower or higher intensity).
- Causes: light fluctuations (more photons hit one sensor cell than another), sensor noise (noise in the voltage levels read from the sensor), quantization effects (a continuous scene is quantized onto a finite grid with finite (integer) intensities)…
How to deal with noise using linear filters?
- Average filter: a 3×3 filter whose entries are all 1/9 (they sum to 1); the resulting pixel intensity is the average of its neighbours. This is also called a box filter.
- Gaussian filter: a weighted average that weights nearby pixels more than distant ones
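A sketch applying both filters to a noisy synthetic step image with scipy.ndimage; the noise level, sigma, and test image are arbitrary assumptions:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(3)
clean = np.zeros((64, 64))
clean[:, 32:] = 1.0                                     # simple step image
noisy = clean + rng.normal(scale=0.2, size=clean.shape)

box_out = ndimage.uniform_filter(noisy, size=3)         # 3x3 box (average) filter
gauss_out = ndimage.gaussian_filter(noisy, sigma=1.0)   # Gaussian-weighted average

# Both filtered versions have a lower pixel-wise error w.r.t. the clean image:
for name, out in [("noisy", noisy), ("box", box_out), ("gauss", gauss_out)]:
    print(name, np.mean((out - clean) ** 2))
```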
What is a box filter?
It is a linear filter that computes the pixel intensity as the average of its neighbours. It tends to produce box-shaped artifacts.
What is a gaussian filter?
It is a linear filter that weights nearby pixels more than distant ones, according to a Gaussian distribution. The 2D kernel is built from one Gaussian along the x-axis and one along the y-axis.
This removes the box artifacts of the box filter and produces smoother results.
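A minimal sketch of building such a kernel as the outer product of a 1D Gaussian for x and one for y; the size and sigma are arbitrary choices:

```python
import numpy as np

def gaussian_kernel_2d(size=5, sigma=1.0):
    """2D Gaussian kernel as the outer product of two 1D Gaussians."""
    x = np.arange(size) - (size - 1) / 2        # e.g. [-2, -1, 0, 1, 2]
    g1d = np.exp(-x**2 / (2 * sigma**2))
    g1d /= g1d.sum()                            # normalize so weights sum to 1
    return np.outer(g1d, g1d)                   # G(x, y) = g(x) * g(y)

K = gaussian_kernel_2d()
print(K.round(3))   # center weight largest, falling off with distance
print(K.sum())      # ~1.0
```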
How to make box filter and gaussian filter more efficient?
They are separable, i.e. they can be written as a convolution of two 1D filters: the Gaussian as a horizontal 1D Gaussian followed by a vertical one, and the box filter as a horizontal average followed by a vertical one.
Think of it as passing over the image twice, once along all rows and once along all columns.
This way, instead of 9 kernel entries (for a 3×3 kernel) we have only 6 (3 + 3), and correspondingly fewer multiplications per pixel.
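A quick check of separability, here using a 3×3 box filter whose 2D kernel is the outer product of two 1D averaging filters; the test image is arbitrary:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(4)
img = rng.random((16, 16))

k1d = np.full((1, 3), 1 / 3)      # horizontal 1D averaging filter
K2d = k1d.T @ k1d                 # 3x3 box kernel = outer product (all 1/9)

one_pass = convolve2d(img, K2d, mode="valid")
two_pass = convolve2d(convolve2d(img, k1d, mode="valid"), k1d.T, mode="valid")
print(np.allclose(one_pass, two_pass))   # True: 3 + 3 multiplies instead of 9
```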
How to deal with boundaries in convolution?
Convolution shrinks the image (a 3×3 filter removes one pixel on each side, reducing width and height by 2), so we add padding to the original image before convolving. How to choose the padding?
- Zeros (black pixels): introduces dark edges in the blurred image
- Wrap: pretend the image is periodic and indices wrap around to the other side (pixel n+1 becomes pixel 1, n+2 becomes 2, …)
- Clamp: extend the last pixel as far as needed
- Mirror: similar to wrap, but the image is reflected at the border (pixel n+1 becomes n−1, n+2 becomes n−2, …)
The wrap method has a higher chance of introducing artifacts, but the choice depends on the use case and the images at hand.
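These four strategies correspond to numpy's padding modes (constant, wrap, edge, reflect); a small 1D illustration:

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5])
for mode in ["constant", "wrap", "edge", "reflect"]:
    print(mode.ljust(8), np.pad(row, 2, mode=mode))
# constant [0 0 1 2 3 4 5 0 0]   zeros -> dark borders after blurring
# wrap     [4 5 1 2 3 4 5 1 2]   image repeats periodically
# edge     [1 1 1 2 3 4 5 5 5]   clamp: last pixel is extended
# reflect  [3 2 1 2 3 4 5 4 3]   mirror at the border
```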
How to remove noise but while preserving edges?
Using a non-linear filter called the median filter. It is similar to the average filter, but instead of the average we take the median of the neighbours (sort them and take the middle value). This preserves large jumps at edges, though it still smooths fine detail.
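A small 1D sketch contrasting the median filter with the averaging filter on a scanline containing a noise spike and a true edge; scipy.ndimage is assumed and the values are made up:

```python
import numpy as np
from scipy import ndimage

# A 1D "scanline": flat region with one noise spike, then a sharp edge.
row = np.array([10., 10., 200., 10., 10., 100., 100., 100.])

print(ndimage.median_filter(row, size=3))   # spike removed, edge stays sharp
print(ndimage.uniform_filter(row, size=3))  # mean: spike smeared, edge blurred
```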
What are morphological filters?
They are filters that usually apply only to binary images. There are two types:
- Dilation: if at least one pixel under the kernel is 1, the result is 1. This expands the foreground.
- Erosion: the result is 1 only if all pixels under the kernel are 1; if at least one is 0, the result is 0.
Both can be generalized to grayscale images.
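A minimal sketch with scipy.ndimage on a toy binary image; the 3×3 all-ones structuring element is an assumption matching the kernel described above:

```python
import numpy as np
from scipy import ndimage

img = np.zeros((9, 9), dtype=bool)
img[2:7, 2:7] = True                 # 5x5 foreground square
img[0, 8] = True                     # isolated single-pixel "noise"

se = np.ones((3, 3), dtype=bool)     # 3x3 structuring element (kernel)

dilated = ndimage.binary_dilation(img, structure=se)  # any 1 under the kernel -> 1
eroded = ndimage.binary_erosion(img, structure=se)    # all 1s under the kernel -> 1

print(dilated.astype(int))   # square grows by one pixel; noise pixel grows too
print(eroded.astype(int))    # square shrinks to 3x3; noise pixel disappears
```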
What are image pyramids? What is the general process?
It is representing one image at multiple scales (resolutions). Multiple lower-resolution versions of the image are created, giving a smaller search space in which to find an object. Once the object is found in a small image, this information is propagated to the larger images and the object is pinpointed in the original image.
What are gaussian pyramids?
It is a pyramid built by repeatedly applying a Gaussian filter and then downsampling (keeping every 2nd pixel both row-wise and column-wise).
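A minimal sketch of this smooth-then-subsample loop; the number of levels and the sigma are arbitrary choices:

```python
import numpy as np
from scipy import ndimage

def gaussian_pyramid(img, levels=4, sigma=1.0):
    """Repeatedly smooth with a Gaussian, then keep every 2nd row/column."""
    pyramid = [img]
    for _ in range(levels - 1):
        smoothed = ndimage.gaussian_filter(pyramid[-1], sigma=sigma)
        pyramid.append(smoothed[::2, ::2])   # downsample by factor 2
    return pyramid

rng = np.random.default_rng(5)
for level in gaussian_pyramid(rng.random((64, 64))):
    print(level.shape)   # (64, 64), (32, 32), (16, 16), (8, 8)
```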
Why do we have to apply gaussian filter before downsampling in gaussian pyramid?
Because after downsampling, high frequencies (sharp edges, sharp color transitions) can no longer be represented: the lower sampling rate cannot capture them.
Without smoothing we get aliasing: for example, small objects might appear bigger in the downsampled image.
What is aliasing in the context of pyramids?
Aliasing appears when no Gaussian filtering is done before downsampling. Small objects can appear bigger in the low-resolution (downsampled) images, and patterns can be introduced that misrepresent the image content.
What is edge detection and why do we need it? What are edges?
It is finding the edges of objects in an image. Edges are essentially the lines that outline an object. We need it because humans can recognize objects from their drawings/edges alone, so we can assume objects are also easier to find in image processing via their edges.
Edges are rapid changes of pixel intensity in the image (large contrast).
What are the goals of edge detection?
- Good detection: the result corresponds to the edge of the object, not some noise
- Good localization: the detected edge is close to the true edge of the object
- Single response: One line per edge
How to detect an edge in 1D? A line of the image?
The idea is that edges correspond to fast intensity changes (large derivative):
- Apply Gaussian smoothing
- Compute the derivative of the resulting curve
- Find local extrema of the derivative that lie above some preset threshold (small changes are ignored)
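A minimal sketch of this three-step pipeline on a synthetic noisy scanline; the threshold, sigma, and signal are arbitrary assumptions:

```python
import numpy as np
from scipy import ndimage

# A noisy 1D scanline with one true edge in the middle (at index 50).
rng = np.random.default_rng(6)
line = np.concatenate([np.full(50, 10.0), np.full(50, 100.0)])
line += rng.normal(scale=2.0, size=line.size)

smoothed = ndimage.gaussian_filter1d(line, sigma=2.0)          # 1. Gaussian smoothing
deriv = np.convolve(smoothed, [0.5, 0.0, -0.5], mode="same")   # 2. central-difference derivative
threshold = 5.0                                                # 3. arbitrary preset threshold
edges = np.where(np.abs(deriv) > threshold)[0]
print(edges)   # indices cluster around the true edge at position 50
```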
How to compute derivative of the 1D image-lines for edge detection?
- We can compute the 1st derivative and find local extrema. This can be implemented as the linear filter [1, −1], but the derivative is then computed between the pixel centers (the kernel has an even number of cells).
- It can also be implemented with the filter 1/2 · [1, 0, −1], which is better: we don't have to shift the image, and the derivative is computed at the pixel centers.
- We can also compute the 2nd derivative and find zero crossings (not just where it equals 0, but where it crosses the y = 0 line).
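A small comparison of the two first-derivative filters on samples of x², whose true derivative 2x makes the half-pixel shift visible:

```python
import numpy as np

f = np.array([0., 1., 4., 9., 16., 25.])   # x^2 sampled at x = 0..5, derivative 2x

forward = np.convolve(f, [1, -1], mode="valid")          # f[i+1] - f[i]
central = np.convolve(f, [0.5, 0, -0.5], mode="valid")   # (f[i+1] - f[i-1]) / 2

print(forward)   # [1. 3. 5. 7. 9.] = 2x at x = 0.5, 1.5, ... (between pixels)
print(central)   # [2. 4. 6. 8.]    = 2x at x = 1, 2, 3, 4 (pixel centers)
```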
How can we simplify edge detection when having a gaussian smoothing and an edge-detection filtering?
By first convolving the edge-detection filter with the Gaussian filter, and then applying the resulting kernel to the image. This reduces computation. It works because convolution is associative: (I * G) * D = I * (G * D).
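A quick 1D check of this trick; the kernels are small arbitrary examples. Smoothing and then differentiating equals a single convolution with the precomputed combined kernel:

```python
import numpy as np

x = np.arange(-3, 4)
gauss = np.exp(-x**2 / 2.0)
gauss /= gauss.sum()                     # 1D Gaussian smoothing filter
deriv = np.array([0.5, 0.0, -0.5])      # central-difference derivative filter

rng = np.random.default_rng(7)
signal = rng.random(50)

two_step = np.convolve(np.convolve(signal, gauss), deriv)   # (I * G) * D
one_step = np.convolve(signal, np.convolve(gauss, deriv))   # I * (G * D)
print(np.allclose(two_step, one_step))   # True, by associativity
```

The combined kernel np.convolve(gauss, deriv) only has to be computed once, then every image needs a single convolution instead of two.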