Wednesday, September 10, 2014

AP 186 Activity 5 - Fourier Transform Model of Image Formation

In this activity, we will familiarize ourselves with the Fourier Transform. So what is the Fourier transform?

The Fourier Transform (FT) is a mathematical transformation that takes a signal from its spatial dimension X into a signal with dimension 1/X; in other words, the FT of a signal lives in inverse space, or spatial frequency space. The FT of an image f(x,y) is given by

F(fx, fy) = ∫∫ f(x,y) exp[-i2π(fx·x + fy·y)] dx dy

where fx and fy are the spatial frequencies along x and y, respectively. An efficient and fast implementation of the FT is the Fast Fourier Transform (FFT) algorithm by Cooley and Tukey. The FFT is a fast and powerful tool, and in this post I will show how to use it in Scilab and what its transformation does to images. In Scilab, we use the function fft() for 1D signals and fft2() for 2D signals. [1]

The discrete FFT and FFT2 have these properties [1]:
1) FFT2 is faster and more efficient if the image dimensions are powers of 2, for example 128x128 or 64x64 pixels. In this activity, all of my images are 128x128 pixels.
2) The output of FFT2 has its quadrants interchanged along the diagonals. That is, if the image quadrants are labeled in the clockwise direction as
        4 1
        3 2
the FFT comes out as
        2 3
        1 4
The fftshift() function interchanges the quadrants back.
3) The output of the FFT is a complex number array. To view the intensity, use the abs() function, which computes the modulus of a complex number.
4) The inverse FFT is just the same as the forward FFT, except that the image comes out inverted.
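These properties are easy to verify interactively in the Scilab console. Below is a tiny illustrative sketch; the 4x4 random matrix is just a stand-in for an image and is not from the activity itself.

// quick check of properties 2 and 3 on a small test matrix
A = rand(4, 4);            // any small real matrix will do
FA = fft2(A);              // forward 2D FFT
isreal(FA)                 // returns %f: the output is a complex array (property 3)
abs(FA)                    // modulus, i.e. the magnitude we display as intensity
fftshift(abs(FA))          // swaps the quadrants back along the diagonals (property 2)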

Part A. Familiarization
   Our input image (128x128 pixels) is a white circle on a black background, which I made in GIMP. In Scilab, I loaded this image and converted it into grayscale. The image values are integers, so I also had to convert them into double. After converting, I applied fft2() to the image and then abs() to the result to view the intensity, used imshow to display the resulting image, and imwrite to save it. In Figure 2 you may see only a black image, but looking closely at the edges you can see that there are white or gray pixels; this is most obvious at the top-left corner, where the white pixels are. When you apply fftshift to the FFT, the quadrants go back to their original places because, as stated in property number 2 of the discrete FFT, the quadrants along the diagonals are interchanged. The FFT-shifted image is shown in Figure 3.
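The actual script I used is shown later as a screenshot (Figure 9). As a rough guide only, a minimal Scilab sketch of the steps just described could look like the one below; the filename, the normalization for display, and the SIVP toolbox (for imread, imshow, and imwrite) are my assumptions.

// sketch of the Part A steps; "circle.bmp" is a placeholder filename
I = double(imread("circle.bmp"));     // load the 128x128 aperture and cast to double
                                      // (apply rgb2gray() first if the file is RGB)
FI  = fft2(I);                        // forward 2D FFT -> complex array
mag = abs(FI);                        // modulus = intensity (Figure 2)
imshow(mag / max(mag));               // normalize to [0,1] for display

magshift = fftshift(mag);             // swap the quadrants along the diagonals (Figure 3)
imshow(magshift / max(magshift));
imwrite(magshift / max(magshift), "circle_fft_shifted.png");

Iback = abs(fft2(fftshift(FI)));      // applying fft2 again recovers the (inverted) image (Figure 4)
imshow(Iback / max(Iback));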


Figure 1. Original image, circular aperture.
Figure 2. After undergoing FFT.

Figure 3. After applying fftshift to the circular aperture.

Figure 4. Inverse FFT of circular aperture.
We then apply fft2() again to the FFT-shifted image to get the inverse FFT; the resulting image is in Figure 4. As stated in the properties of the discrete FFT, the inverse FFT is like the forward FFT but with the image inverted. How can we tell that the image is inverted? In the original image, the top side of the circle is thinner than the bottom side and the left side is thinner than the right side (unfortunately, this is only distinguishable when you zoom in on the original circular aperture). Looking at Figure 4, the thinner sides are now on the right and bottom, which shows that this is indeed the inverted version of our original circular aperture image.

I also wanted to see what happens when the circle's radius is decreased, so I did the FT again, this time with an image of a smaller circle. Here are the results.

Figure 5. Original image of the smaller circular aperture.
Figure 6. After undergoing FFT.

Figure 7. After undergoing fftshift.

Figure 8. Inverse FFT.
As we can see in Figure 7, when the smaller circle underwent the FT and fftshift, it formed an Airy pattern. So what is an Airy pattern? According to Wikipedia, "In optics, the Airy disk (or Airy disc) and Airy pattern are descriptions of the best focused spot of light that a perfect lens with a circular aperture can make, limited by the diffraction of light." [3] This means that our original circle image acts like the aperture of a perfect lens. Another observation is that the radius of the circle and the size of the Airy disk are inversely proportional: the larger the radius of the circle, the smaller the Airy disk.

Figure 9. Code snippet for circular aperture.

 We will use another image, this time a white letter A on a black background, again made in GIMP. The same steps were followed: first FFT was applied, then fftshift. The inverse FFT was obtained by applying FFT to the FFT-shifted image.
Figure 10. Original image of letter A.
Figure 11. Letter A after undergoing FFT.
Figure 12. After applying fftshift to the FFT of the letter A.

Figure 13. Inverse FFT of letter A. 

The FT of the letter A, shown in Figure 12, is very pretty; it looks like a star or an asterisk. This is the pattern you would get from a lens with a letter-A-shaped aperture. As we can see in Figure 13, the recovered image is clearly inverted.


Figure 14. Code snippet for FT of letter A.

To show that the FFT output is complex, I looked at the values of the FT of the circular aperture and of the letter A in Scilab's variable browser. The arrays are indeed made up of real and imaginary parts.

Figure 15. Values of the FT of circular aperture.
Figure 16. Values of the FT of letter A.

Now we take the FT of four more images. The first is a sinusoid along the x-axis, also known as a corrugated roof; the second is a simulated double slit; the third is a square function, or square aperture; and the last is a 2D Gaussian bell curve. All of the input images were simulated in Scilab before taking their FTs, then a screenshot of each plot was taken and scaled to 128x128 pixels using GIMP.
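The scripts that generate these patterns are shown as screenshots in Figures 19, 22, 25, and 28. As a sketch only, the four patterns could also be built directly as 128x128 arrays (skipping the plot-screenshot step described above); the frequency, slit positions, square size, and Gaussian width below are made-up values, not necessarily the ones I actually used.

// sketch: the four test patterns as 128x128 arrays (parameter values are illustrative)
n = 128;
x = linspace(-1, 1, n);
[X, Y] = meshgrid(x, x);

roof  = sin(2*%pi*4*X);                      // sinusoid along x ("corrugated roof")

slits = zeros(n, n);                          // simulated double slit:
slits(:, 60:62) = 1;                          // two narrow vertical openings
slits(:, 66:68) = 1;

sq_ap = zeros(n, n);                          // centered square aperture
sq_ap(54:74, 54:74) = 1;

gauss = exp(-(X.^2 + Y.^2) / (2*0.1^2));      // 2D Gaussian bell curve

// each pattern is then transformed the same way as before, e.g. for the roof:
Froof = fftshift(abs(fft2(roof)));
imshow(Froof / max(Froof));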


Figure 17. Original image of corrugated roof.

Figure 18. FFT of corrugated roof.
Figure 19. Code snippet for corrugated roof.

 As we can see, the FT of the corrugated roof consists of three small bright pixels: the middle one is the brightest, while the other two have the same, weaker gray intensity.

Figure 20. Original image of simulated double slit. 
Figure 21. FT of simulated double slit. 

Figure 22. Code snippet for FT of double slit.

 As we can see in Figure 21, the FT of the simulated double slit looks like the diffraction pattern of a double-slit experiment: there are alternating bright and dark fringes. So awesome!

Figure 23. Square function or square aperture.

Figure 24. FT of square aperture.

Figure 25. Code snippet of FT of square aperture.
The FT of the square aperture, shown in Figure 24, looks like a star. It also shows a pattern analogous to the Airy disk.
Figure 26. 2D Gaussian bell curve.

Figure 27. FT of 2D Gaussian bell curve.
Figure 28. Code snippet for FT of 2D Gaussian bell curve.
For the FT of the 2D Gaussian bell curve, shown in Figure 27, we can see Airy-like rings around the central white spot and a line of diffraction-like fringes along the vertical axis. (Analytically, the FT of a Gaussian is another Gaussian, so this extra structure most likely comes from the hard edges introduced when the plot was screenshotted and rescaled in GIMP.)

Part B. Simulation of an imaging device.

For this part of the activity, we require two images, both 128x128 pixels: the VIP image shown in Figure 29 and a circular aperture image. The input images were made in GIMP, with the letters VIP in the Arial font. Our object image is the VIP image, and we will use circular apertures of different sizes. We will convolve the two images, the VIP (object) image and a circular aperture. To convolve them, we take the product of their FTs; but since the circular aperture is taken to be already in Fourier space, we only apply fftshift() to it instead of fft2(). Afterwards, we apply the inverse FFT and then abs() to get the output image.
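The script for this part is shown as a screenshot in Figure 34. A minimal sketch of the procedure just described, with placeholder filenames and the SIVP toolbox assumed for imread and imshow, might be:

// sketch of Part B: imaging the "VIP" object through a circular aperture
obj  = double(imread("vip.bmp"));      // 128x128 object image
aper = double(imread("circle.bmp"));   // 128x128 circular aperture (lens pupil)

Fobj  = fft2(obj);                     // FT of the object
Faper = fftshift(aper);                // the aperture is treated as already being in
                                       // Fourier space, so it is only quadrant-shifted
product = Fobj .* Faper;               // product in Fourier space = convolution

img = abs(fft2(product));              // inverse FFT via another fft2, then modulus
imshow(img / max(img));                // output: inverted "VIP", blurred by the aperture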
Figure 29. Original image of VIP.


Figure 30. On the left is the circular aperture used, on the right is the output image.

Figure 31.  On the left is the circular aperture used, on the right is the output image.

Figure 32.  On the left is the circular aperture used, on the right is the output image.

Figure 33.  On the left is the circular aperture used, on the right is the output image.

As we can see from Figures 30 to 33, the bigger the radius of the circular aperture, the sharper the output image. This is because more light rays pass through a larger aperture than through a smaller one. In Figure 30 we can't read the letters VIP, while in Figure 31 we can. Since the output is an inverse FFT, we know it comes out inverted. In Figure 31 we can also see diffraction patterns spreading outward from the letters.

This is the expected result, because convolution is the "smearing" of one function against another so that the resulting function looks a little like both of the original functions [1]. As stated in the manual [1], convolution is used to model the linear regime of instruments or detection devices such as those used in imaging: the first function is the object, while the other is the transfer function of the imaging device. Their convolution is the image produced by the instrument, which in general is not identical to the original object. In this case, our two functions are the VIP image and the circular aperture, and our result is the inverted VIP with diffraction patterns; the detection system being modeled is a lens with a circular aperture.


Figure 34. Code snippet for convolution.




 Part C. Template matching using correlation.
According to the manual [1], the correlation p of two 2D functions f and g can be computed in Fourier space as

P = F*G

where P, F, and G are the Fourier transforms of p, f, and g, and the asterisk on F stands for complex conjugation. The correlation p measures the degree of similarity between the two functions f and g: the more identical they are at a certain position (x,y), the higher their correlation value there. This is why the correlation function is used mostly in template matching or pattern recognition. An important consequence of the equation above is that if f or g is an even function, the correlation is equal to the convolution. [1]

To show how correlation is used, we take the two images shown in Figures 35 and 36, both made in GIMP. The A in Figure 36 has the same font and size as the text in Figure 35. I took the FFT of the two images, applied conj() to the FFT of Figure 35, and multiplied the result by the FFT of the letter A. I then applied the inverse FFT and took its abs(). This is not yet the final result, since the quadrants are still mixed up, so I applied fftshift() at the end.
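The script itself appears as a screenshot in Figure 38. A sketch of the correlation steps just described, with assumed filenames, is:

// sketch of Part C: template matching by correlation, P = conj(F).*G
txt  = double(imread("text.bmp"));     // image containing the phrase (Figure 35)
tmpl = double(imread("letterA.bmp"));  // template: the single letter A (Figure 36)

Ftxt  = fft2(txt);
Ftmpl = fft2(tmpl);

P = conj(Ftxt) .* Ftmpl;               // conjugate one FT and multiply with the other
pimg = abs(fft2(P));                   // back to the spatial domain, take the modulus
pimg = fftshift(pimg);                 // put the quadrants back in place
imshow(pimg / max(pimg));              // bright peaks mark the positions of the A's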

Figure 35. Original image of the text to be correlated with Figure 36.

Figure 36. Original image of the letter A, in the same font and size as the text in Figure 35.

Figure 37. Output image using correlation.
The output is shown above and, as you can see, the brightest dots are at the positions of the letter A in the words. As stated earlier, the correlation is highest where the two functions match, which is why the locations of the letter A appear brightest: they are where the image best matches the single letter A template.


Figure 38. Code snippet for correlation.


Part D. Edge detection using convolution integral

Edge detection works like template matching, but here the template is a 3x3 matrix representing an edge pattern, which is convolved with an image (the VIP image). The pattern must be a 3x3 matrix whose elements sum to zero. I used three matrices taken from the manual [1]: one for horizontal edges, one for vertical edges, and one for a spot, as shown below:

 Figure 39. Horizontal edge matrix (left), vertical edge matrix (middle), spot edge matrix (right).

 I then used the Scilab function conv2() to convolve the image with each edge matrix, and mat2gray() to display the resulting images in grayscale.
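The script for this part is shown in Figure 44. Below is a sketch of the edge-detection step; the filename is a placeholder, and the three zero-sum patterns are typical examples of this kind of matrix, so the exact values in the manual (and in Figure 39) may differ.

// sketch of Part D: edge detection by convolving with 3x3 zero-sum patterns
vip = double(imread("vip.bmp"));

horiz = [-1 -1 -1;  2  2  2; -1 -1 -1];   // responds to horizontal edges
vert  = [-1  2 -1; -1  2 -1; -1  2 -1];   // responds to vertical edges
spot  = [-1 -1 -1; -1  8 -1; -1 -1 -1];   // responds to edges in every direction

eh = conv2(vip, horiz);                   // 2D convolution with each pattern
ev = conv2(vip, vert);
es = conv2(vip, spot);

imshow(mat2gray(eh));                     // mat2gray (as used in the post) rescales
imshow(mat2gray(ev));                     // each result to [0,1] for grayscale display
imshow(mat2gray(es));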

Figure 40. Original image of VIP.


Figure 41. Using horizontal matrix pattern.

The result of convolving the image with the horizontal edge matrix is that the horizontal edges of the letters, including the curved strokes that lie horizontally, are sharp, while the other edges are blurred.
Figure 42. Using vertical matrix pattern.
The result of the vertical edge matrix is that the vertical edges of the letters, including the curved strokes that lie vertically, are sharp, while the other edges are blurred.
Figure 43. Using spot matrix pattern. 
The result of the spot edge matrix is that all of the edges of the letters are sharp. I think this is because the convolution tries to match the edges of the image against the edge pattern, and where they match the response is highest, just as in correlation, which is why the edges come out bright and sharp.

Figure 44. Code snippet for part D.

I would like to give myself a score of 10/10 since I understood the activity well and I tried changing the radius of the circular aperture in part A.

Thank you very much to Ma'am Jing Soriano for answering my queries in my code and about the instructions in the manual.

References:
[1] M. Soriano, AP 186 Laboratory Manual, A5 - Fourier Transform Model of Image Formation, 2014.

[2] R. Fisher, S. Perkins, A. Walker, and E. Wolfart, Image Transforms - Fourier Transform. Retrieved from http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm, Sept. 9, 2014.

[3] Wikipedia, Airy disk. Retrieved from http://en.wikipedia.org/wiki/Airy_disk, Sept. 10, 2014.
