Backpropagation is a method for computing gradients of complex (as in complicated) composite functions. It is one of those things that is quite simple once you have figured out how it works (the other being Paxos).
In this document, I will try to explain it in a way that avoids the issues I had with most of the explanations I found in textbooks and on the Internet. Inherently, this means that this document is well-suited to my way of thinking, but my hope is that it may also suit yours or, at the very least, may give you a different perspective as you try to grasp how backpropagation works.