# Derivation: Error Backpropagation & Gradient Descent for Neural Networks

The material in this post has been migrated, with Python implementations, to my GitHub Pages website.

I recently received my PhD from UC Berkeley, where I studied computational neuroscience and machine learning.

Posted on September 6, 2014, in Algorithms, Classification, Derivations, Gradient Descent, Machine Learning, Neural Networks, Optimization, Regression, Theory. 18 Comments.

1. daFeda

Hi, this is the first write-up on backpropagation I actually understand. Thanks.

A few possible bugs:
1. The last part of Eq. 8 should, I think, sum over a_i and not z_i.
2. Between Eq. 3 and Eq. 4 it should, I think, be z_k = b_k + … and not z_k = b_j + … (see the corrected form sketched after this comment).
3. The last section says "Output layer bias" while the derivation is for the hidden layer bias. Also, b_i seems to be used as the notation for the hidden layer bias when it should be b_j.

All in all, a very helpful post.
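
For reference, a sketch of the corrected equation implied by point 2 above, assuming the post's usual notation ($a_j$ for hidden-layer activations, $w_{jk}$ for the weight from hidden unit $j$ to output unit $k$, and $b_k$ for the output unit's bias):

```latex
% Pre-activation of output unit k. The bias subscript should be k,
% matching the unit it feeds, rather than j:
z_k = b_k + \sum_j a_j\, w_{jk}
```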

2. daFeda

Reblogged this on DaFeda's Blog and commented:
The easiest-to-follow derivation of backpropagation I’ve come across.

3. Ayan Das

Probably the best derivation of BackProp I’ve ever seen on the internet 🙂

4. Devin

Thanks. Nice clean explanation.

5. Arnab Kanti Kar

Thank you!
This is the second time I’ve benefited from your blog.

6. Donghao Liu

Best introduction to backprop ever!
Thank you so much.

7. Really useful! Though there are a few typos, as daFeda has mentioned.

I just have one small question I’m hoping somebody can answer.

I understand this algebraically, and I understand the iterative patterns created with the deltas when calculating the weights for the different layers starting backwards,

but WHY does (a_k – t_k) * the derivative mean that the ERROR (which is equal to a_k – t_k) is being “BACK PROPAGATED”? What’s the intuition behind multiplying by the derivative that makes us say this?
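
A sketch of the intuition, assuming the post's notation ($z_k$ is the pre-activation of output unit $k$, $a_k = g(z_k)$ its activation, $t_k$ the target, $w_{jk}$ the weight from hidden unit $j$ to output unit $k$, and $E = \tfrac{1}{2}\sum_k (a_k - t_k)^2$ the squared error): the quantity sent backwards is not the raw error $(a_k - t_k)$ itself, but the derivative of the loss with respect to each unit's pre-activation.

```latex
% Output-layer delta: chain rule through the activation function.
% \partial E / \partial a_k = (a_k - t_k) and \partial a_k / \partial z_k = g'(z_k), so:
\delta_k \equiv \frac{\partial E}{\partial z_k} = (a_k - t_k)\, g'(z_k)

% Hidden-layer delta: hidden unit j collects the output deltas,
% weighted by the same connections w_{jk} used in the forward pass:
\delta_j \equiv \frac{\partial E}{\partial z_j} = g'(z_j) \sum_k \delta_k\, w_{jk}
```

Multiplying by $g'(z_k)$ converts an error in the activation into an error in the pre-activation, i.e., it measures how much the loss would change if $z_k$ changed. And because each hidden delta is formed by sending the output deltas back through the very weights used in the forward pass, the error signal literally travels backward through the network, which is why we say it is "back propagated."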

9. Ugenteraan

This really cleared up all the confusion I had about backpropagation. Thanks a bunch!

10. Thank you, dustinstansbury. Finally, I understood backpropagation.

11. edmund

25 years ago I had these formulae in my PhD, but I couldn’t retrieve a copy. Luckily, I found your blog (true story), and your very clear exposition refreshed my memory.

12. Anurag Reddy

A very neat and simple derivation. Great job!!

13. rostys

Thanks, very interesting and helpful article. A great introduction to neural networks for beginners like me.