Theory Grid_Layers
section‹Line Classification Model (\thy)›
text‹
In the
following, we introduce neural networks for (image) classification using a
simple line classification problem: given a $2 \times 2$ pixel greyscale image,
the neural network should decide whether the image contains a horizontal line (e.g.,
\autoref{fig:h_line}), a vertical line (e.g., \autoref{fig:v_line}), or no line
(\autoref{fig:n_line}).
\begin{figure}[ht]
\begin{subfigure}{0.24\textwidth}
\centering
\includegraphics[width=.35\linewidth]{grid-horizontal}
\caption{horizontal line}
\label{fig:h_line}
\end{subfigure}
\hfill
\begin{subfigure}{0.24\textwidth}
\centering
\includegraphics[width=.35\linewidth]{grid-vertical}
\caption{vertical line}
\label{fig:v_line}
\end{subfigure}
\hfill
\begin{subfigure}{0.24\textwidth}
\centering
\includegraphics[width=.35\linewidth]{grid-noline}
\caption{no line}
\label{fig:n_line}
\end{subfigure}
\hfill
\begin{subfigure}{0.24\textwidth}
\centering
\includegraphics[width=.35\linewidth]{grid-misclassification}
\caption{misclassification}
\label{fig:miss_line}
\end{subfigure}
\caption{Example input images to our classification problem.}
\label{fig:grid-net-inputs}
\end{figure}
Traditionally, textbooks (e.g.,~\<^cite>‹"aggarwal:neural:2018"›) define a
feedforward neural network as a directed weighted acyclic graph. The nodes are
called \emph{neurons} and the incoming edges are called \emph{inputs}. For a
given neuron $k$ with $m$ inputs $x_{k_0}$ to $x_{k_{m-1}}$ and the respective
weights $w_{k_0}$ to $w_{k_{m-1}}$, the neuron computes the output
\begin{equation}
\label{weighted-sum-def}
y_{k}=\varphi \left(\beta + \sum_{j=0}^{m-1} w_{k_j} x_{k_j}\right)
\end{equation}
where $\varphi$ is the \emph{activation function} and $\beta$ the \emph{bias}
for the neuron $k$. The values for the weights and biases are determined during
the training (learning) phase, which we omit due to space reasons. In our work,
we assume that the given neural network is already trained, e.g., using the
widely used machine learning framework
TensorFlow~\<^cite>‹"abadi.ea:tensorflow:2015"›.
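To make \autoref{weighted-sum-def} concrete, the following Python sketch computes the output of a single neuron; the weights, bias, and input values are made-up illustration values, not taken from a trained network.

```python
def neuron(inputs, weights, bias, activation):
    # y_k = phi(beta + sum_j w_{k_j} * x_{k_j}), cf. the weighted-sum definition
    return activation(bias + sum(w * x for w, x in zip(weights, inputs)))

identity = lambda x: x                 # linear activation
step = lambda x: 0 if x <= 0 else 1    # binary step activation

# Hypothetical neuron with four inputs (one per pixel of a 2x2 image).
y = neuron([1.0, 1.0, 0.0, 0.0], [0.5, 0.5, -0.5, -0.5], -0.25, step)
print(y)  # weighted sum is 1.0, plus bias -0.25 gives 0.75 > 0, so step yields 1
```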
\autoref{fig:grid-net} illustrates the architecture of our neural network:
the network for our example classification problem has four inputs (one for
each pixel of the image), each expecting an input value between $0.0$ (white) and
$1.0$ (black).
\begin{figure*}
\centering
\includegraphics[width=0.5\textwidth]{grid-nn}
\caption{Neural network for classifying lines in $2 \times 2$ pixel greyscale images.}
\label{fig:grid-net}
\end{figure*}
It also has three outputs, one for each possible class (horizontal line,
vertical line, no line). The neurons (nodes) can be naturally categorised into
layers: the \emph{input layer} consists of the input nodes and the
\emph{output layer} of the output nodes. Moreover, our neural
network has one \emph{hidden layer} with 16 neurons. The input layer and the
hidden layer use a linear activation function (i.e., $\varphi(x) = x$) for all
neurons, and the output layer uses the binary step function (i.e., $\varphi(x)=0$
for $x \leq 0$ and $\varphi(x) =1$ otherwise). In our
example, there is an edge from every neuron of one layer to every neuron of the
next layer; such a layer is often called a \emph{dense layer}. Machine learning
approaches using neural networks with one or more hidden layers are called
\emph{deep learning}.
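The layered structure can be sketched in plain Python as follows; the weight matrices here are randomly initialised stand-ins (a trained network would supply the actual values), so the resulting class scores are meaningless and only the data flow through the dense layers is illustrative.

```python
import random

def dense_layer(inputs, weights, biases, activation):
    # Dense layer: every neuron sees all outputs of the previous layer.
    return [activation(b + sum(w * x for w, x in zip(ws, inputs)))
            for ws, b in zip(weights, biases)]

identity = lambda x: x                 # used by the input and hidden layer
step = lambda x: 0 if x <= 0 else 1    # used by the output layer

random.seed(0)  # stand-in weights, NOT the trained network's values
hidden_w = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(16)]
out_w = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(3)]

image = [1.0, 1.0, 0.0, 0.0]  # top row black: a horizontal line
hidden = dense_layer(image, hidden_w, [0.0] * 16, identity)
scores = dense_layer(hidden, out_w, [0.0] * 3, step)  # one score per class
```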
In our example, we used the Python API for
TensorFlow~\<^cite>‹"abadi.ea:tensorflow:2015"› to train our neural network. We
obtained a neural network that classifies black lines in a given $2
\times 2$ image with 100\% accuracy. While this sounds great, the neural network
is not very resilient to changes in its input values. Consider, for example,
\autoref{fig:miss_line}: a human expert would, very likely, classify this image
as ``no line''. Yet our neural network classifies it as a horizontal line,
even though the upper right pixel is only light grey with a numerical value of $0.05$,
much closer to white than to black. Such a misclassification is usually called
an \emph{adversarial example}. If such a network is used in a safety- or
security-critical application, e.g., for classifying street signs, such misclassifications
can be life-threatening.
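For completeness, labelled training data for this problem can be generated by enumerating all binary $2 \times 2$ images. The labelling rule below is one plausible reading of the three classes (it treats, e.g., the all-black image as ``no line'' because no single line is distinguishable), not necessarily the exact rule we used for training.

```python
from itertools import product

def label(img):
    # img = (p0, p1, p2, p3) laid out row-wise: p0 p1 / p2 p3
    p0, p1, p2, p3 = img
    if (p0 == p1 == 1 and p2 == p3 == 0) or (p2 == p3 == 1 and p0 == p1 == 0):
        return "horizontal"
    if (p0 == p2 == 1 and p1 == p3 == 0) or (p1 == p3 == 1 and p0 == p2 == 0):
        return "vertical"
    return "no line"

# All 16 binary 2x2 images with their ground-truth class.
dataset = [(img, label(img)) for img in product([0, 1], repeat=4)]
```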
›
theory
Grid_Layers
imports
NN_Layers_List_Main
begin
end