An introduction to Neural Networks


Neural networks, or Artificial Neural Networks (ANNs), are one of the essential parts of machine learning.

For a long time, people have tried to build computers as smart as themselves, or even smarter. We have succeeded to some extent and are moving towards creating even more intelligent machines. But how can one build a system as smart as a human? What if we made a machine that thinks the way we do? One approach is to give the machine a brain (with neurons) like ours. This is, loosely, what we do in Artificial Neural Networks. We build artificial neurons and process the data in a way modelled on how our brain would, and after processing the data with these artificial neurons we can hopefully get the result we are looking for.

Let’s see how our brain works


The human brain contains billions of cells called neurons. These cells are connected to each other, forming a large and complex network.



When a neuron's cell body fires, it produces an electrical impulse that travels down the axon. Across the synapses, this impulse excites other neurons, which can in turn fire and send out spike trains of their own. In this sense each neuron is a kind of computational unit, and a very complicated one.

An artificial neural network works in a somewhat similar way.

Let’s see how an artificial neural network works



This “neuron” is a computational unit that takes as input $x_1, x_2, x_3$ (and a +1 intercept term), and outputs

$$h_{W,b}(x) = f(W^T x) = f\left(\sum_{i=1}^{3} W_i x_i + b\right),$$

where $f : \Re \mapsto \Re$ is called the activation function.
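As a rough illustration of the single neuron described above, here is a minimal Python sketch. The choice of the sigmoid as the activation function f, the function names `sigmoid` and `neuron_output`, and the input and weight values are all assumptions made for this example; the article does not fix any of them.

```python
import math

def sigmoid(z):
    """Logistic activation (an assumed choice of f): maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(x, W, b, f=sigmoid):
    """Compute f(sum_i W_i * x_i + b) for a single neuron."""
    z = sum(w_i * x_i for w_i, x_i in zip(W, x)) + b
    return f(z)

# Illustrative values only (hypothetical, not from the article).
x = [1.0, 2.0, 3.0]      # inputs x1, x2, x3
W = [0.5, -0.25, 0.0]    # one weight per input
b = 0.0                  # weight on the +1 intercept (bias) term

out = neuron_output(x, W, b)
print(out)  # the weighted sum is 0.5 - 0.5 + 0.0 = 0.0, and sigmoid(0.0) = 0.5
```

Any other function from real numbers to real numbers could be passed as `f` in place of the sigmoid.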

A neural network is formed when we put together many such simple neurons. It contains several layers, and the output of one layer may act as the input of another layer.



Here, $x_1, x_2, x_3$ are the inputs, and "+1" is the bias unit, which corresponds to the intercept term.

The leftmost layer acts as the input layer and the rightmost layer as the output layer. The intermediate layers are called hidden layers, because their values are not observed directly in the training data.

Let's say we have three functions, f(), g() and h(), one applied by each successive layer of the network.

We can conclude that the output of the above network is h(g(f(x))). This is called a feed-forward network, as the input is processed while moving in the forward direction, from the input layer to the output layer.
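The composition h(g(f(x))) can be sketched in a few lines of Python. The bodies of `f`, `g` and `h` below are hypothetical placeholders chosen only to keep the arithmetic easy to follow; in a real network each would be a layer of neurons. The helper name `feed_forward` is also an assumption of this example.

```python
def f(x):
    """Placeholder for the first layer: doubles every value."""
    return [2 * v for v in x]

def g(x):
    """Placeholder for the second layer: adds 1 to every value."""
    return [v + 1 for v in x]

def h(x):
    """Placeholder for the output layer: sums the values into one number."""
    return sum(x)

def feed_forward(x, layers):
    """Pass x through each layer in order; the output of one is the input of the next."""
    for layer in layers:
        x = layer(x)
    return x

x = [1, 2, 3]
result = feed_forward(x, [f, g, h])
print(result)  # f gives [2, 4, 6], g gives [3, 5, 7], h gives 15
```

The loop only ever moves forward through the list of layers, which is the sense in which such a network is "feed-forward".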



We provide the inputs (Xi) to the input layer and assign random initial weights (Wi) to them. Layer L2 receives the inputs multiplied by their weights (and summed, as in the single neuron above), and the input for each subsequent layer is provided in the same way. After processing the input data, we compare the network's output with the actual output (we already have sample input/output pairs).
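The forward pass and the comparison with the known output could look roughly like the sketch below. Several things here are assumptions for illustration: the layer sizes (3 inputs, 4 hidden neurons, 1 output), the sigmoid activation, the weight range of -1 to 1, the squared-error measure, and the sample input and target values.

```python
import math
import random

def sigmoid(z):
    """Assumed activation function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def random_layer(n_in, n_out, rng):
    """Create random weights (one row per neuron) and one random bias per neuron."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def layer_forward(x, weights, biases):
    """Each neuron outputs sigmoid(weighted sum of its inputs + bias)."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)               # fixed seed so the run is repeatable
hidden = random_layer(3, 4, rng)     # weights feeding layer L2
output = random_layer(4, 1, rng)     # weights feeding the output layer

x = [0.5, -1.0, 2.0]                 # hypothetical sample input
target = 1.0                         # hypothetical known output for that input

a2 = layer_forward(x, *hidden)               # activations of the hidden layer
prediction = layer_forward(a2, *output)[0]   # the network's output
error = 0.5 * (target - prediction) ** 2     # squared error against the known output

print(f"prediction={prediction:.4f}, error={error:.4f}")
```

Because the weights are random, the prediction will generally be far from the target at this stage.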

There can be a difference between these values. Initially, this error arises because the weights were assigned randomly. We can reduce the error by changing the weights in the previous layer. When we change the weights in L3, we also need to adjust the weights in L2 to account for those changes, and we repeat this layer by layer until we reach the input layer. This way of adjusting the weights in each layer, working backwards from the output, is called back propagation.
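To make the weight-adjustment idea more concrete, here is a small sketch of back propagation with plain gradient descent on a single training example. It is only one possible way to do it, under stated assumptions: a network with 3 inputs, 2 hidden neurons and 1 output, sigmoid activations, a squared-error measure, a learning rate of 0.5, and made-up input and target values. The names `forward`, `train_step` and `params` are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Assumed activation function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, p):
    """Return the hidden activations and the network output for input x."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(p["W1"], p["b1"])]
    y = sigmoid(sum(w * a for w, a in zip(p["W2"], hidden)) + p["b2"])
    return hidden, y

def train_step(x, target, p, lr=0.5):
    """One gradient-descent update; returns the error measured before the update."""
    hidden, y = forward(x, p)
    # Error signal at the output neuron for E = 0.5 * (y - target)^2.
    delta_out = (y - target) * y * (1.0 - y)
    # Pass the signal back to each hidden neuron (computed with the old W2).
    delta_hidden = [delta_out * w * a * (1.0 - a)
                    for w, a in zip(p["W2"], hidden)]
    # Adjust the output-layer weights first...
    for j, a in enumerate(hidden):
        p["W2"][j] -= lr * delta_out * a
    p["b2"] -= lr * delta_out
    # ...then the weights of the layer before it.
    for j, d in enumerate(delta_hidden):
        for i, v in enumerate(x):
            p["W1"][j][i] -= lr * d * v
        p["b1"][j] -= lr * d
    return 0.5 * (y - target) ** 2

rng = random.Random(0)  # fixed seed so the run is repeatable
params = {
    "W1": [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)],
    "b1": [0.0, 0.0],
    "W2": [rng.uniform(-1, 1) for _ in range(2)],
    "b2": 0.0,
}
x, target = [0.5, -1.0, 2.0], 1.0   # hypothetical sample input/output pair

errors = [train_step(x, target, params) for _ in range(200)]
print(f"error at first step: {errors[0]:.4f}, at last step: {errors[-1]:.4f}")
```

Running this, the error at the last step should be smaller than at the first, since each update nudges the weights in the direction that reduces the error on this example. Real training would repeat this over many examples rather than a single one.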




[This article draws on various sources from the internet.]