An introduction to Artificial Neural Networks
There are many things that machines and computers do far better than humans, such as solving complex mathematical problems or creating simulations and animations. Robots can work in factories with greater efficiency and speed, without needing to rest. However, when it comes to common sense, imagination and inspiration, we are still far superior to machines. Since the invention of the computer, the notion that machines might be able to think on their own has become increasingly popular. Because of how traditional computers worked, it seemed virtually impossible to create an intelligent system that could recognise human emotion through facial expressions, read a wide variety of texts in cursive handwriting, or drive autonomously through busy streets. In order to achieve these things, scientists had to create an entirely new type of computing, one based on the structure of the human brain. This new computing approach is known as Artificial Neural Networks.
Inspired by the structure of the human brain, artificial neural networks (ANNs) provide a viable way to make computers more human-like and to help machines reason about problems on their own.
ANNs are the main driving force behind deep learning. Deep learning is a subset of machine learning, which we discussed in our introductory article. It is a technique that enables computers to do what comes naturally to humans: learning by example. It is the key technology behind autonomous vehicles, speech recognition and image recognition.
What are Artificial Neural Networks (ANN)?
“Artificial Neural Networks (ANNs) are an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. They are composed of a large number of highly interconnected processing elements (neurons) working in unison to solve a specific problem.”
Image source: TechCrunch
Neural networks are a set of algorithms whose design is inspired by the human brain. They are built to recognise and distinguish between patterns. They interpret sensory data through a kind of machine perception, labelling or clustering raw input. The patterns they recognise are numerical, contained in vectors, into which data such as images, sound, or text must be translated.
A neural network contains a number of hidden layers through which the data is processed, allowing the machine to go “deep” in its learning, making connections and weighting input for the best results. The term ‘deep’ refers to the number of layers in a neural network. While deep neural networks can have as many as two hundred layers, traditional neural networks usually have only a few, often around three.
Figure 1: Branches of Artificial Intelligence, one of which is Artificial Neural Networks.
Neural networks were first developed in the 1950s to help computers behave like interconnected brain cells, much as in a human brain. An artificial neural network is an attempt to simulate the network of neurons that makes up the human brain, giving the computer the ability to learn and make its own decisions based on data.
The ability to learn is a fundamental aspect of intelligence. Although it can be difficult to define precisely what ‘learning’ means in every context, the learning process in an artificial neural network can be described as the network continuously updating itself so that it can perform a specific task efficiently. Performance on the task improves over time as the weights in the network are iteratively updated. Artificial neural networks differ from traditional expert systems in that they appear to learn the underlying rules from a given collection of representative examples, and this is what makes them exciting to work with.
How do Artificial Neural Networks Work?
A typical neural network contains a large number of artificial neurons, called units, arranged in a series of layers. The input layer is where the network receives data, along with representative examples that show the ANN what the output should look like. The hidden layers are where the input is processed and “broken down”. These layers are shown in figure 2.
Figure 2: Neural network
The layers can be explained as follows:
- Input layer – It contains the units (artificial neurons) that receive input from the outside world, which the network will learn from, recognise, or otherwise process.
- Output layer – It contains the units that respond with what the network has learned about the task.
- Hidden layer – These units sit between the input and output layers. The job of the hidden layers is to transform the input into something the output units can use.
Most neural networks are fully connected, meaning that each hidden unit is linked to every unit in the previous (input) layer and to every unit in the next (output) layer.
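The structure above can be sketched in a few lines of code. The following is a minimal, illustrative example of a fully connected network: each unit computes a weighted sum of the previous layer's outputs plus a bias, passed through an activation function (a sigmoid here). The layer sizes and random weights are arbitrary, chosen only to show the shape of the computation.

```python
import numpy as np

def dense_layer(x, W, b):
    """One fully connected layer: a weighted sum of the inputs plus a bias,
    passed through a sigmoid activation."""
    z = W @ x + b
    return 1.0 / (1.0 + np.exp(-z))

# A toy network: 3 input units -> 4 hidden units -> 2 output units.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, -1.0, 2.0])      # input layer values
h = dense_layer(x, W1, b1)          # hidden layer activations
y = dense_layer(h, W2, b2)          # output layer activations
```

Note that every hidden unit depends on every input, and every output unit on every hidden unit, which is exactly what "fully connected" means.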
An analogy may help in understanding neural networks better. Learning in a neural network closely resembles how we humans learn to do things: we perform an action and are either satisfied or dissatisfied with the result. Unsatisfactory results tend to make a person repeat a task until they achieve the desired result. Similarly, neural networks require a “supervisor” to describe what the desired result should be in response to the input.
Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks. They are trained by feeding them large amounts of labelled data; the network architectures learn features directly from the data, without the need for manual feature engineering.
In deep learning networks, each layer of nodes learns a distinct set of features based on the previous layer’s output. The further you advance into the neural net, the more complex the features its nodes can recognise, since they aggregate and recombine features from the previous layer. This is known as a feature hierarchy: a hierarchy of increasing complexity from layer to layer.
Based on the difference between the actual value and the predicted value, an error value, also called the cost function, is calculated and sent back through the system.
Cost function: one half of the squared difference between the actual and the predicted value, C = ½(ŷ − y)².
For each layer of the network, the cost function is analysed and used to adjust the thresholds and weights for the next input. Our aim is to minimise the cost function: the lower the cost, the closer the predicted value is to the actual value. In this way, the error becomes marginally smaller on each run as the network learns how to analyse values.
The resulting error is fed back through the entire neural network. The weighted synapses connecting the input variables to the neuron are the only things we have control over.
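For a single neuron, this feedback loop can be sketched concretely. The example below is a minimal illustration, not the full backpropagation algorithm: it uses the one-half-squared-error cost described above and repeatedly nudges the weights in the direction that reduces it (gradient descent). The data, learning rate, and number of steps are all illustrative choices.

```python
import numpy as np

def cost(y_hat, y):
    # One half of the squared difference between predicted and actual value.
    return 0.5 * (y_hat - y) ** 2

# A single linear neuron: y_hat = w . x + b
x = np.array([1.0, 2.0])
y = 3.0                      # actual (target) value
w = np.array([0.1, 0.1])
b = 0.0
lr = 0.1                     # learning rate

for _ in range(100):
    y_hat = w @ x + b
    # Derivative of 0.5 * (y_hat - y)^2 with respect to y_hat.
    grad = y_hat - y
    w -= lr * grad * x       # adjust the weights against the gradient
    b -= lr * grad           # adjust the bias the same way

print(round(float(w @ x + b), 3))  # prediction converges toward 3.0
```

Each pass shrinks the error a little, which is exactly the "marginally smaller on each run" behaviour described above.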
Figure 3: Information propagation in a neural network
Types of Neural Networks in Artificial Intelligence
Note that the figure below showcases only 7 of the most widely used neural networks; in total there are more than 27 types.
Figure 4: Neural Network Architecture Types (image source: xenonstack)
- Perceptron Model in Neural Networks – A perceptron neural network has a layer of input units and one output unit, with no hidden layers. These are also known as ‘single-layer perceptrons’.
- Radial Basis Function Neural Network – These networks are similar to feed-forward neural networks, except that a radial basis function is used as the activation function of the neurons.
- Multilayer Perceptron Neural Network – These networks use one or more hidden layers of neurons, unlike the single-layer perceptron. They are also known as deep feedforward neural networks.
- Recurrent Neural Network – A type of neural network in which the hidden layer neurons have self-connections, giving the network memory. At any instant, a hidden layer neuron receives activation from the layer below as well as its own previous activation value.
- Long Short-Term Memory Neural Network (LSTM) – A type of neural network in which a memory cell is incorporated into the hidden layer neurons.
- Hopfield Network – A fully interconnected network in which each neuron is connected to every other neuron. The network is trained on an input pattern by setting the values of the neurons to the desired pattern, after which its weights are computed; the weights are not changed afterwards. Once trained on one or more patterns, the network will converge to the learned patterns. In this respect it differs from the other networks listed here.
- Boltzmann Machine Neural Network – These networks are similar to Hopfield networks, except that some neurons are input neurons while others are hidden. The weights are initialised randomly and then learned during training.
- Convolutional Neural Network – Get a complete overview of Convolutional Neural Networks through our blog Log Analytics with Machine Learning and Deep Learning.
- Modular Neural Network – A combined structure of different types of neural networks, such as a multilayer perceptron, a Hopfield network and a recurrent neural network, incorporated as modules into a single network, each performing an independent subtask of the complete network.
- Physical Neural Network – Physical neural networks use electrically adjustable resistive material to emulate the function of a synapse, instead of the software simulations used in other neural networks.
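The simplest entry in the list above, the single-layer perceptron, fits in a few lines of code. The sketch below trains one on the logical AND function using the classic perceptron update rule; the learning rate and number of passes are illustrative choices, and AND was picked only because it is a small, linearly separable problem.

```python
def predict(w, b, x):
    # Step activation: output 1 if the weighted sum crosses the threshold.
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Truth table for logical AND: two inputs, one output, no hidden layer.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w, b, lr = [0.0, 0.0], 0.0, 0.1
for _ in range(20):                      # a few passes over the data
    for x, target in data:
        error = target - predict(w, b, x)
        w[0] += lr * error * x[0]        # classic perceptron update rule
        w[1] += lr * error * x[1]
        b += lr * error

print([predict(w, b, x) for x, _ in data])  # prints [0, 0, 0, 1]
```

Because the output unit is connected directly to the inputs with no hidden layer, a perceptron like this can only learn linearly separable patterns; problems like XOR need the multilayer perceptron described above.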
Four Different Techniques of ANNs
Classification Neural Network
A classification neural network can be trained to classify given patterns or datasets into predefined classes. It uses feedforward networks to do this.
Prediction Neural Network
A prediction neural network can be trained to produce the outputs expected from a given input. The network ‘learns’ to produce outputs similar to the representative examples given in the input.
Clustering Neural Network
A clustering neural network can identify unique features of the data and classify them into different categories without any prior knowledge of the data.
The following networks are used for clustering:
- Competitive networks
- Adaptive Resonance Theory Networks
- Kohonen Self-Organising Maps.
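The networks listed above all rely on some form of competitive, winner-take-all learning. As a rough illustration of that idea (not an implementation of any one of them), the sketch below has two competing units, each holding a "prototype" value; for every input, the closest unit wins and moves slightly toward it. The one-dimensional data and all parameters are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D data with two obvious clusters, around 0 and around 10.
data = np.concatenate([rng.normal(0, 0.5, 50), rng.normal(10, 0.5, 50)])

# Two competing units, each holding one weight (its "prototype").
weights = np.array([4.0, 6.0])
lr = 0.1

for _ in range(10):                                    # a few passes
    for x in rng.permutation(data):
        winner = np.argmin(np.abs(weights - x))        # closest unit wins
        weights[winner] += lr * (x - weights[winner])  # move toward input

print(np.sort(weights).round(1))   # prototypes settle near 0 and 10
```

No labels are ever shown to the network; the two prototypes discover the cluster centres purely from the structure of the input data, which is the essence of clustering with neural networks.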
Association Neural Network
An association neural network can be trained to remember a particular pattern; when a noisy pattern is presented, the network associates it with the closest pattern in its memory or discards it.
Learning Techniques in Neural Networks
Supervised Learning
In supervised learning, training data is fed into the network and the desired output is known in advance. The weights are adjusted until the network produces the desired output.
Unsupervised Learning
The input data is used to train the network, but the desired output is not known. The network classifies the input data and adjusts the weights by extracting features from the input data.
Reinforcement Learning
In reinforcement learning, the desired output is unknown, but the network receives feedback on whether its output is right or wrong. This type of learning is sometimes referred to as semi-supervised learning.
Online Learning
In online learning, the weights and threshold are adjusted after each training sample is presented to the network.
Offline Learning
In offline learning, the weight vector and threshold are adjusted only after the entire training set has been presented to the network. It is also called batch learning.
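The difference between the two schedules can be shown side by side. In the sketch below, the same one-weight linear model is fitted to toy data generated from y = 2x: the online version updates the weight after every sample, while the batch version averages the gradient over the whole training set and updates once per pass. The data, learning rate, and pass count are illustrative.

```python
import numpy as np

# Toy data generated from y = 2x; we fit a single weight w.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x
lr = 0.01

# Online learning: update the weight after every individual sample.
w_online = 0.0
for _ in range(200):
    for xi, yi in zip(x, y):
        w_online += lr * (yi - w_online * xi) * xi

# Offline (batch) learning: one update per full pass over the training set.
w_batch = 0.0
for _ in range(200):
    grad = np.mean((y - w_batch * x) * x)   # averaged over all samples
    w_batch += lr * grad

print(round(w_online, 2), round(w_batch, 2))  # both approach 2.0
```

Both schedules reach the same answer here; in practice they trade off differently, with online updates reacting faster to each sample and batch updates giving a smoother, more stable path.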
What are ANNs used for?
The original purpose of artificial neural networks was to solve problems and reason in the same way a human brain would. Over time, however, it was realised that ANNs could be put to better use by shifting their focus to performing specific tasks. This way, an ANN can be perfected for a specific task more efficiently, such as computer vision, speech recognition, machine translation, or social network filtering. Artificial neural networks in deep learning have even enabled machines to perform tasks previously thought to be limited to humans, such as painting and creating music.