It's a legitimately valid part of machine learning, and it's not easy for novices to do.
And I need help putting it on my badger damn it!
If we're talking about a longer format, such as a book, then we might consider digging deeper and implementing as much as possible using the barest of Python requirements. Indeed, Joel Grus does implement everything from scratch in his great (although a bit dated) book https://www.amazon.com/Data-Science-Scratch-Principles-Pytho....
EDIT: This is still a work in progress (and relies on numpy and matplotlib), but here is my version: https://github.com/DataForScience/DeepLearning These notebooks are meant to support a webinar, so they might not be the clearest on their own, but you also have the slides there.
https://www.amazon.com/Data-Science-Scratch-Principles-Pytho...
But maybe it’s educational to do once if you never have before.
The problem is that it's extremely hard to make it efficient. Dozens of man-years have been spent optimizing linear algebra libraries, and only a handful of linalg libraries have competitive performance. It was my college project to make a fast linalg library, and boy is it fast. Some things, like matrix multiplication, take >2 mins if you implement them in C with the trivial algorithm, but with some tricks (vectorization, OpenMP, handwritten assembly, automatically optimized code, better algorithms...) you can get them to run in under a second.
So, if you want to implement linalg in some language and compile it, go ahead, more power to you. But it's basically impossible to do it efficiently. My opinion is: this is fine and we should do this. There should be linalg libraries written in pure Python (even if they are 1000x slower than LAPACK), but just understand that it's impossible to satisfy all use cases of numpy this way (at least currently).
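For reference, the "trivial algorithm" mentioned above looks roughly like this in pure Python. This is only a sketch: the function names `matmul_naive` and `matmul_ikj` are made up for illustration, and the loop reordering in the second version is just one of the simplest locality tricks from compiled languages. It is not how a tuned BLAS works, and in pure Python the speedup is modest at best.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def matmul_ikj(a, b):
    """Same arithmetic with the loops reordered (i, k, j) so that
    b is walked row by row instead of column by column."""
    n, m, p = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(m):
            a_ik = a[i][k]
            row_b = b[k]
            for j in range(p):
                row_c[j] += a_ik * row_b[j]
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul_naive(a, b)
```

Both functions do the same O(n^3) work; everything a fast library adds (blocking, vectorization, threading) sits on top of this basic structure.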
Perhaps it is overkill. It's just not actually from scratch without it, you know?
If you're not used to working with matrices, simply reading the Wikipedia article might tell you enough to implement them yourself.
Or, just download a fast BLAS from your hardware vendor...
I'm a C# developer and I'm sure it would take me all of about 30 seconds to install a matrix multiplication package through nuget. I'm sure it would be immediately obvious how to add items to matrices or do a dot product.
It was dead easy to get code examples as needed.
So much learning that we're missing by not going through this step.
It uses Octave - but you first do everything (in the section on NN) "by hand" - building and looping for the matrix operations. Only after you've gone that far, does he (Ng) introduce the fact that Octave has vector/matrix primitives...
I took the original ML Class in the Fall of 2011; it was a great class, and opened my eyes a great deal on the topic of machine learning and neural networks, which I had struggled with understanding in the past (mainly on what and how backprop worked).
1. https://www.amazon.com/Make-Your-Own-Neural-Network/dp/15308...
Here's an example of what one student of the ML Class built, after being inspired by what he was learning and videos that played during the course:
https://blog.davidsingleton.org/nnrccar/
It kinda shocked me at the time, because I knew quite a bit about ALVINN from books and articles I had read as a teenager in the 80s and 90s. This guy had created the same thing using a cell phone and a cheap RC vehicle! Ok, there was also an Arduino and computer involved - but it really hit home the fact that technology around neural networks had advanced quite a bit!
I also took the other course, "AI Class", but due to personal issues I had to drop out about halfway through.
The next year, after Udacity started, they introduced a course similar to AI Class called "How to Build Your Own Self-Driving Vehicle" (it's called something else today - something like "Robotics and Artificial Intelligence 302" or something like that).
That class was done in Python, and taught me even more about AI/ML - with a focus towards self-driving vehicles of course. Things I learned about that I struggled with or had no real concepts of before:
1. SLAM (Simultaneous Localization and Mapping)
2. Path Finding algorithms (A* and the like)
3. Kalman Filtering (what it is for, how it works)
4. PID Algorithm (how to implement and tune it)
5. More neural network stuff...
...and many other things. Another excellent, free course to take if you're interested in learning this stuff.
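To give a flavor of the PID item in that list, here is a minimal sketch of a PID controller in plain Python. It is not taken from the course: the `PID` class, the `simulate` helper, the gains, and the toy plant (a position that simply integrates the control signal) are all assumptions chosen for illustration.

```python
class PID:
    """Textbook PID: output = kp*error + ki*integral(error) + kd*d(error)/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative estimate on the first call

    def update(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps, dt=0.1, target=1.0):
    """Drive a toy plant (x changes at a rate equal to the control signal)
    toward `target` and return the final position."""
    pid = PID(kp=1.0, ki=0.1, kd=0.05)  # hand-picked gains, not tuned
    x = 0.0
    for _ in range(steps):
        control = pid.update(target - x, dt)
        x += control * dt
    return x


final_position = simulate(2000)
```

Tuning is the hard part in practice: the gains here only behave well for this particular toy plant and time step.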
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagat...
It was detailed enough for me to do all the calculations in an Excel workbook for one complete cycle (forward, backward, and forward again with the learned weights):
https://1drv.ms/x/s!Ar06sKFtc9d7goR5WQLo-RkB0XvWAA
Which allowed me to play with the numbers and factors to better understand how they impact the network as a whole.
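The same forward, backward, forward cycle can be sketched in a few lines of plain Python. This is not the network or the numbers from the linked post or the workbook: the layer sizes (2 inputs, 2 sigmoid hidden units, 1 sigmoid output), the lack of bias terms, the weights, and the learning rate are all arbitrary assumptions, and the function names are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass: returns the hidden activations and the output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def squared_error(output, target):
    return 0.5 * (target - output) ** 2


def train_step(x, target, w_hidden, w_out, lr):
    """One backward pass: chain rule for the squared error, then a
    gradient-descent update. Returns the new weights."""
    hidden, output = forward(x, w_hidden, w_out)
    # dE/d(net_out) for squared error with a sigmoid output
    delta_out = (output - target) * output * (1.0 - output)
    # push the error back through the (old) output weights
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - lr * d * xi for w, xi in zip(row, x)]
        for row, d in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_out


# Arbitrary illustrative values
x, target, lr = [1.0, 0.5], 1.0, 0.5
w_hidden = [[0.1, 0.2], [0.3, 0.4]]  # one row of input weights per hidden unit
w_out = [0.5, 0.6]

_, out_before = forward(x, w_hidden, w_out)              # forward
w_hidden2, w_out2 = train_step(x, target, w_hidden, w_out, lr)  # backward
_, out_after = forward(x, w_hidden2, w_out2)             # forward again

error_before = squared_error(out_before, target)
error_after = squared_error(out_after, target)
```

Changing `lr` or any single weight and rerunning the three steps is the code equivalent of tweaking cells in the spreadsheet.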
https://jsdw.me/posts/neural-nets/
I found that I had to read a bunch of these things to really grasp them myself.