What the heck is Deep Learning?

Hello curious minds👋

In this article, we are going to discuss Deep Learning. We will start our journey with the human mind, then go through the core concepts of DL, and end with an amazing experiment related to the rewiring of the brain.

Introduction #

As humans, our decisions depend on the environment we live in. We all have unique preferences and tastes. Some people like cold weather⛄, while others (like me) prefer a warmer climate♨.

The basic structure of the human brain is the same, yet each one of us is unique. This uniqueness is the result of a combination of genetic factors and individual life experiences. As we gain more experience, new connections form among our neurons.

Neuron #

A neuron is the fundamental unit of the brain, responsible for taking input, processing it, and providing output. The input can be sensory input from the external world or input from other neurons. Similarly, the output can go to other neurons or become motor commands to the muscles. A neuron looks like this,

Image Credit: Queensland Brain Institute

A dendrite is where a neuron receives input, and the axon is the output structure of the neuron. Based on the inputs received through the dendrites, an action potential is generated and transmitted along the axon to other cells.

Let us understand how a neuron works through an example of how we make decisions 😉.

How we make decisions #

Let us consider 3 friends who are deciding whether to go for dinner at a non-veg restaurant 😋🤤. Each one of them is deciding whether to have a non-veg meal or not. Based on past experiences and preferences, they ask themselves the following 4 questions to decide whether to go or not.

Am I vegetarian?
Have I had a non-veg meal recently?
Do I like this meal?
Do I like this restaurant?

Everyone will answer the above yes/no questions and, combined with their preferences, come up with a decision. All this processing happens inside our brain.

Through research, scientists came up with a mathematical model that loosely resembles a biological neuron. This model is known as an Artificial Neuron. With our example, we can visualize the artificial neuron as follows,

Each person answers the questions in a yes/no manner, and the weights (W1,..,W4) are decided by their likes and dislikes, e.g. if a person really dislikes a non-veg meal then W1 will be negative, and vice versa.

The inputs (x), i.e. the answers to the questions, are converted into numbers before being fed to our Artificial Neuron. Here we can simply map YES → 1 and NO → 0.

After this, the inputs (Xs) and weights (Ws) are represented as vectors, and the dot product between them is calculated. That dot product is then compared to a threshold, and the output decision is calculated with a simple rule like the one below,

If dot_product ≥ Threshold then 1 (True)

else 0 (False)
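To make the threshold rule concrete, here is a minimal Python sketch of one friend's decision. The weights and the threshold are made-up numbers chosen only for illustration (they are not from any real model), and `artificial_neuron` is just a name picked for this example.

```python
def artificial_neuron(inputs, weights, threshold):
    """Return 1 if the dot product of inputs and weights reaches the threshold, else 0."""
    dot_product = sum(x * w for x, w in zip(inputs, weights))
    return 1 if dot_product >= threshold else 0


# Answers to the four questions, mapped YES -> 1 and NO -> 0:
# [vegetarian?, had non-veg recently?, like this meal?, like this restaurant?]
answers = [0, 1, 1, 1]

# Hypothetical preferences of one friend (assumed values).
weights = [-5.0, -1.0, 2.0, 1.5]
threshold = 2.0

decision = artificial_neuron(answers, weights, threshold)
print(decision)  # 1, because -1.0 + 2.0 + 1.5 = 2.5 >= 2.0
```

With the same weights, a friend who answers YES to the first question gets a dot product of -2.5, so the neuron outputs 0.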

Artificial Neural Network #

Our brain has roughly 86 billion biological neurons and trillions of connections between them, which give us our cognitive abilities. Similarly, we can create a network of artificial neurons where the neurons are connected as shown in the diagram below,

The neurons are arranged in layers. The layer that takes the input is known as the input layer, the layer that provides the output is known as the output layer, and the layers in between are known as hidden layers.

The outputs of one layer are presented as inputs to the next layer. Each connection has its own dedicated weight.
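The layer-by-layer flow can be sketched in a few lines of Python. This is only an illustration under assumptions: every neuron uses the simple threshold rule from earlier with an assumed threshold of 0.5, the layer sizes (4 inputs, 3 hidden neurons, 1 output) are arbitrary, and all the weights are invented for the example.

```python
def step(value, threshold=0.5):
    """Threshold rule: 1 if the value reaches the threshold, else 0."""
    return 1 if value >= threshold else 0


def layer_forward(inputs, layer_weights):
    """Compute the outputs of one layer.

    layer_weights holds one list of weights per neuron in the layer,
    with one weight for each incoming connection.
    """
    outputs = []
    for neuron_weights in layer_weights:
        weighted_sum = sum(x * w for x, w in zip(inputs, neuron_weights))
        outputs.append(step(weighted_sum))
    return outputs


def network_forward(inputs, layers):
    """Feed the outputs of each layer in as the inputs of the next layer."""
    activations = inputs
    for layer_weights in layers:
        activations = layer_forward(activations, layer_weights)
    return activations


# Hypothetical weights: 3 hidden neurons with 4 inputs each, 1 output neuron.
hidden_layer = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
output_layer = [[-2.0, 1.0, 0.5]]

print(network_forward([0, 1, 1, 1], [hidden_layer, output_layer]))  # [1]
```

Notice that the input layer does no computation here; it just holds the numbers that are passed to the first hidden layer.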

In an artificial neuron, there is one more term per neuron, known as the bias term. A bias term is a real number that is added to the result of the dot product. You can read more about the bias term here.
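One common way to think about the bias is as the threshold moved to the other side of the comparison: checking dot_product ≥ Threshold is the same as checking dot_product + bias ≥ 0 when bias = -Threshold. The sketch below checks this for every possible set of yes/no answers. The weights and threshold are assumed values used only for illustration.

```python
from itertools import product


def neuron_with_threshold(inputs, weights, threshold):
    dot_product = sum(x * w for x, w in zip(inputs, weights))
    return 1 if dot_product >= threshold else 0


def neuron_with_bias(inputs, weights, bias):
    dot_product = sum(x * w for x, w in zip(inputs, weights))
    # The bias is added to the dot product, and the result is compared to 0.
    return 1 if dot_product + bias >= 0 else 0


weights = [-5.0, -1.0, 2.0, 1.5]  # hypothetical preferences
threshold = 2.0                   # hypothetical threshold
bias = -threshold

# Try all 16 combinations of yes/no answers to the four questions.
same_everywhere = all(
    neuron_with_threshold(answers, weights, threshold)
    == neuron_with_bias(answers, weights, bias)
    for answers in product([0, 1], repeat=4)
)
print(same_everywhere)  # True
```

The advantage of the bias form is that the threshold becomes just another number attached to the neuron, alongside its weights.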

While calculating the output of the neuron in the above example of having a non-veg meal, we compared the dot product with a threshold to decide the output of the neuron. This function is known as a step function, shown in fig (a) below.

We can use all sorts of functions to decide what output a neuron will give. These functions are known as Activation Functions.

Two common activation functions are shown in the image below,

Image Credit: www.researchgate.net (a) step function (b) sigmoid function
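For readers who prefer code to plots, here is a small sketch of both functions. The step function jumps straight from 0 to 1, while the sigmoid, 1 / (1 + e^(-z)), moves smoothly between 0 and 1. The default threshold of 0 for the step function is an assumption made for this example.

```python
import math


def step(z, threshold=0.0):
    """Step function: 0 below the threshold, 1 at or above it."""
    return 1.0 if z >= threshold else 0.0


def sigmoid(z):
    """Sigmoid function: a smooth curve that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3))
```

At z = 0 the sigmoid gives exactly 0.5, and it gets closer to 0 or 1 as z moves further away from 0 in either direction.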

We call the weights and biases of an ANN its trainable parameters, meaning the output of the ANN depends upon them. Different sets of parameters yield different results.

The difference between an untrained network and a trained network is that the trained network’s parameters give results very close to the expected results, i.e. the parameters have been optimized to give the best results possible.

We arrive at those sets of parameters by training the neural network (we will discuss training NNs in my next blog).
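To get a feel for how many trainable parameters even a tiny network has, we can count them. The sketch below assumes a fully connected network, where every neuron in one layer is connected to every neuron in the next layer and every non-input neuron has one bias. The layer sizes are hypothetical.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes lists the number of neurons per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per neuron in the receiving layer
    return total


# 4 inputs (our four questions), 3 hidden neurons, 1 output neuron.
print(count_parameters([4, 3, 1]))  # 19 = (4*3 + 3) + (3*1 + 1)
```

Real networks follow the same counting rule, only with far larger layers.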

Deep Learning #

ANNs which have more than one hidden layer are known as Deep Neural Networks, aka DNNs.

Deep Learning is the study of DNNs.

So having hidden layers makes an ANN special 🤯. What is it about them that makes DNNs so popular today 🤔?

To understand the role of hidden layers in a neural network, let us look at how we solve any complex problem. Well, we solve it by breaking it down into simple, solvable problems. Then we build relations between these solved problems to solve the complex problem.

Similarly, hidden layers give an ANN the ability to break a complex problem into simpler ones and then build on top of them to finally solve the complex problem.

Take the example of image processing using an ANN. The first few layers learn to identify edges, shapes, gradients, etc. from a given image. Later layers are then able to understand corners, text, objects, etc.
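A much smaller example shows the same idea. XOR ("exactly one of the two inputs is 1") cannot be computed by a single neuron with a step function, but it becomes easy once a hidden layer splits it into two simpler questions: "is at least one input 1?" and "are the inputs not both 1?". The weights and biases below are hand-picked for illustration, not learned by training.

```python
def step_neuron(inputs, weights, bias):
    """A single artificial neuron with a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total >= 0 else 0


def xor_network(a, b):
    # Hidden layer: two simpler sub-problems.
    h_or = step_neuron([a, b], [1, 1], -1)     # 1 if a OR b
    h_nand = step_neuron([a, b], [-1, -1], 1)  # 1 unless a AND b
    # Output layer: combine the two partial answers (h_or AND h_nand).
    return step_neuron([h_or, h_nand], [1, 1], -2)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each hidden neuron solves an easy problem on its own, and the output neuron only has to combine their answers.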

Universal Approximation Theorem #

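Now you may ask: to solve any complex problem, can’t we just increase the number of hidden layers in the network? Yes*, you are right, but with an asterisk.

The Universal Approximation Theorem (UAT) says that an ANN with enough trainable parameters (enough hidden neurons with a non-linear activation function) can approximate any continuous function over a bounded range of inputs as closely as we like. So the asterisk above stands for two things: the result is only ever approximate, and the theorem tells us that suitable parameters exist, not how to find them through training.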
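Here is a rough sketch of the intuition behind the UAT, not a proof. It builds a network with one hidden layer of sigmoid neurons that approximates f(x) = x² on the range [0, 1] with a "staircase", and shows that the worst-case error shrinks as we add hidden neurons. The target function, the steepness value, and the way the parameters are hand-picked (rather than trained) are all assumptions made for this illustration.

```python
import math


def sigmoid(z):
    """Numerically stable sigmoid, safe for large positive or negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def target(x):
    return x * x


def build_network(n_hidden, steepness=1000.0):
    """Hand-pick parameters for one hidden layer of n_hidden sigmoid neurons.

    Each hidden neuron acts like a sharp step placed in the middle of one
    slice of [0, 1]; its output weight is how much the target rises there.
    """
    edges = [i / n_hidden for i in range(n_hidden + 1)]
    hidden = []
    for left, right in zip(edges, edges[1:]):
        centre = (left + right) / 2
        jump = target(right) - target(left)
        hidden.append((steepness, -steepness * centre, jump))  # (weight, bias, output weight)
    output_bias = target(edges[0])
    return hidden, output_bias


def predict(network, x):
    hidden, output_bias = network
    return output_bias + sum(v * sigmoid(w * x + b) for w, b, v in hidden)


def max_error(network, samples=1001):
    """Largest gap between the network and the target on evenly spaced points."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(predict(network, x) - target(x)) for x in xs)


for n in (5, 50):
    print(n, "hidden neurons -> max error", round(max_error(build_network(n)), 4))
```

More hidden neurons means finer stairs and a closer fit, which is the "approximately" in the asterisk. In practice, parameters are found by training rather than chosen by hand.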

Let me end our discussion on this topic by sharing an amazing experiment,

Neuroscientists at MIT were wondering about the question: what would happen if we rewired different regions of the brain?

To find the answer, they rewired the brains of ferrets so that the auditory processing region of the brain would receive visual signals. Through this experiment, the neuroscientists found that ferrets can learn to “see” with the auditory processing region of the brain 🤯🤯.

This experiment suggests that the mammalian brain may use a single algorithm to perform most of its tasks. Before this hypothesis, ML research was more fragmented. As ANNs became more popular, using DL to solve all kinds of ML tasks became standard practice.

Take Away #

If you still haven’t fully understood what we discussed, the least you should take away from this blog is the following 😉

An artificial neuron is the basic building block of an ANN, and its operation is inspired by a biological neuron.
A network of artificial neurons connected in a layered manner (input, hidden, output) with more than one hidden layer is known as a DNN.
Deep Learning is the study of DNNs.
Using DL we can build complex concepts out of simpler concepts.
DL is famous today because an ANN with enough trainable parameters can approximate the solution to a very wide range of tasks.

I hope you are now clear on the concept of Deep Learning.

For more such AI-related content, click here.

We will talk about training neural networks in the next blog.

Till then, Stay Safe and Stay Curious 🤔😉