%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%2345678901234567890123456789012345678901234567890123456789012345678901234567890
% 1 2 3 4 5 6 7 8
\documentclass[letterpaper, 10 pt, conference]{ieeeconf} % Comment this line out
% if you need a4paper
%\documentclass[a4paper, 10pt, conference]{ieeeconf} % Use this line for a4
% paper
\IEEEoverridecommandlockouts % This command is only
% needed if you want to
% use the \thanks command
\overrideIEEEmargins
% See the \addtolength command later in the file to balance the column lengths
% on the last page of the document
% The following packages can be found on http:\\www.ctan.org
%\usepackage{graphics} % for pdf, bitmapped graphics files
%\usepackage{epsfig} % for postscript graphics files
%\usepackage{mathptmx} % assumes new font selection scheme installed
%\usepackage{times} % assumes new font selection scheme installed
%\usepackage{amsmath} % assumes amsmath package installed
%\usepackage{amssymb} % assumes amsmath package installed
\title{\LARGE \bf
Shape Feature Extraction for Image Recognition with CNNs Using the Frequency Domain
}
%\author{ \parbox{3 in}{\centering Huibert Kwakernaak*
% \thanks{*Use the $\backslash$thanks command to put information here}\\
% Faculty of Electrical Engineering, Mathematics and Computer Science\\
% University of Twente\\
% 7500 AE Enschede, The Netherlands\\
% {\tt\small h.kwakernaak@autsubmit.com}}
% \hspace*{ 0.5 in}
% \parbox{3 in}{ \centering Pradeep Misra**
% \thanks{**The footnote marks may be inserted manually}\\
% Department of Electrical Engineering \\
% Wright State University\\
% Dayton, OH 45435, USA\\
% {\tt\small pmisra@cs.wright.edu}}
%}
\author{Palash Dahiphale% <-this % stops a space
\thanks{*This work was not supported by any organization}% <-this % stops a space
}
\begin{document}
\maketitle
\thispagestyle{empty}
\pagestyle{empty}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{abstract}
This work shows that a convolutional neural network (CNN) can learn shape features robustly by operating in the frequency domain, yielding accurate image recognition.
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{INTRODUCTION}
Nowadays, many computer vision tasks are performed by deep learning models (most popularly CNNs), but often with low computational speed and reduced accuracy, especially for highlighted images. Discrete Fourier transforms provide a significant speedup in the computation of convolutions in deep learning. In this work, I demonstrate that, beyond its advantages for efficient computation, the spectral domain also provides a powerful representation in which to model and train convolutional neural networks (CNNs).
\section{APPROACH}
\subsection{Why the frequency domain?}
One powerful property of frequency analysis is the operator duality between convolution in the spatial domain and element-wise multiplication in the spectral domain, which yields considerable computational savings. Moreover, the key shape information of an image resides in its phase spectrum (see Experiment-1), which frequency analysis extracts directly, and denoising an image is also straightforward in the frequency domain (see Experiment-2). Finally, the unitarity of the Fourier basis makes it convenient for analyzing approximation loss: by Parseval's theorem, the $\ell_2$ loss between any input $x$ and its approximation $\hat{x}$ equals the corresponding loss in the frequency domain.
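Both properties above can be checked numerically. The following is a minimal NumPy sketch; the 64-sample signals and the circular-convolution setup are illustrative assumptions, not part of the experiments reported below:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)   # signal
h = rng.standard_normal(64)   # filter (same length, circular convolution)

# Duality: circular convolution in the spatial domain equals
# element-wise multiplication in the frequency domain.
conv_spectral = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))
conv_direct = np.array([np.sum(x * np.roll(h[::-1], k + 1))
                        for k in range(64)])
assert np.allclose(conv_spectral, conv_direct)

# Parseval's theorem: the l2 norm is preserved (up to the 1/N factor
# implied by NumPy's unnormalized forward-FFT convention).
assert np.isclose(np.sum(np.abs(x) ** 2),
                  np.sum(np.abs(np.fft.fft(x)) ** 2) / 64)
```

The same identity underlies FFT-based convolution layers: one forward transform per feature map replaces the sliding-window product.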
\subsection{Applications of spectral representations}
The first application is spectral parametrization: learning the filters of a CNN directly in the frequency domain. Namely, the filters are parametrized as maps of complex numbers whose discrete Fourier transforms correspond to the usual filter representations in the spatial domain. The second is spectral pooling. Pooling refers to the dimensionality reduction used in CNNs to impose a capacity bottleneck and facilitate computation. Spectral pooling performs dimensionality reduction by projecting onto the frequency basis set and then truncating the representation.
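As an illustration of the spectral-pooling idea, here is a minimal NumPy sketch; the `spectral_pool` helper, the image size, and the mean-preserving rescaling are my own illustrative choices:

```python
import numpy as np

def spectral_pool(img, out_h, out_w):
    """Downsample by truncating high frequencies: FFT, crop the centered
    spectrum to out_h x out_w, inverse FFT. A sketch assuming a
    single-channel real-valued image."""
    f = np.fft.fftshift(np.fft.fft2(img))           # center low frequencies
    h, w = img.shape
    top, left = (h - out_h) // 2, (w - out_w) // 2
    f_crop = f[top:top + out_h, left:left + out_w]  # keep low-frequency block
    # Rescale so the mean intensity survives the change of FFT size.
    scale = (out_h * out_w) / (h * w)
    return np.real(np.fft.ifft2(np.fft.ifftshift(f_crop))) * scale

img = np.outer(np.hanning(32), np.hanning(32))  # smooth 32x32 test image
pooled = spectral_pool(img, 16, 16)             # 2x reduction per axis
```

Unlike max pooling, the output size can be any integer, so the reduction ratio is tunable per layer rather than fixed at powers of two.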
\section{EXPERIMENTATION}
I carried out several experiments to support my assumptions. Using those assumptions, I measured the speed and accuracy of a baseline model and analyzed the results. To address the difficulties encountered, I performed further experiments, and finally I measured the speed and accuracy of the improved model.
\begin{itemize}
\item{Experiment-1} To support the assumption that the key information of an image lies in its phase spectrum, I took two images and computed the FFT of each to separate the phase and magnitude parts. I combined the magnitude of image-1 with the phase of image-2 and vice versa, then reconstructed the combined images by taking the inverse FFT. The results supported the assumption: each reconstruction resembled the image that contributed the phase.
\item{Experiment-2} To support the assumption that denoising can be done through frequency-domain analysis, which helps separate out the actual content of an image, I passed an image through a high-pass filter. The result shows only the outline of the image, which means most of the image's information lies in the lower frequency range, supporting the assumption.
\item{Experiment-3} In this experiment, I computed the FFT of each image and fed it as the input of a CNN instead of the raw image. The results showed that the trained model required considerably larger storage and ran at a slower computational speed.
\item{Experiment-4} After analyzing the results of the previous experiment, I concluded that an improper pooling strategy leads to larger memory requirements. In this experiment, I demonstrated how the use of spectral pooling avoids the problem of large storage.
\item{Experiment-5} After analyzing the results of Experiment-3, I concluded that the slow computational speed is caused by learning the CNN filters in the spatial domain. In this experiment, I demonstrated why spectral parametrization avoids the problem of slow computational speed.
\item{Experiment-6} After analyzing the above problems, I used both solutions to train the model on the CIFAR-10 data set and obtained interesting results: 81.24\% accuracy with fewer epochs.
\end{itemize}
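Experiments 1 and 2 can be reproduced in a few lines of NumPy. The sketch below uses random arrays as stand-ins for the real photographs used in the experiments, and the filter cutoff is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(1)
img1 = rng.random((64, 64))  # stand-ins for the two real photographs
img2 = rng.random((64, 64))

# Experiment-1: swap the magnitude and phase spectra of the two images.
f1, f2 = np.fft.fft2(img1), np.fft.fft2(img2)
hybrid_a = np.real(np.fft.ifft2(np.abs(f1) * np.exp(1j * np.angle(f2))))
hybrid_b = np.real(np.fft.ifft2(np.abs(f2) * np.exp(1j * np.angle(f1))))
# With natural images, hybrid_a is recognizably image-2 and hybrid_b
# image-1: the perceived shape follows the phase donor.

# Experiment-2: high-pass filter by zeroing the centered low frequencies.
f = np.fft.fftshift(np.fft.fft2(img1))
mask = np.ones((64, 64))
mask[24:40, 24:40] = 0.0  # suppress the central low-frequency block
edges = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
# With a natural image, `edges` retains mainly outlines, showing that
# most of the image's energy sits in the low frequencies.
```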
\addtolength{\textheight}{-12cm} % This command serves to balance the column lengths
% on the last page of the document manually. It shortens
% the textheight of the last page by a suitable amount.
% This command does not take effect until the next page
% so it should come on the page before the last. Make
% sure that you do not shorten the textheight too much.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{ACKNOWLEDGMENT}
I would like to thank Dr. Subrat Kar for useful discussions and assistance throughout this project. I also want to thank Mr. Manohar Kumar for giving me the direction and outline of the project.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}