My own experience basically agrees: I’ve read and enjoy Skiena, it’s written in clear style and it’s the “cover to cover” text for a working developer or for interview practice. But I also have TAOCP and CLRS on my shelf, and I haven’t read either of them. I’ve certainly used both of them a lot, but I simply haven’t read all of them because I use them more as reference texts.
Personally I find it bothersome that these textbooks are written with idiosyncratic pseudocode. In my opinion, many of these authors lose their grasp of common implementation difficulties if they don’t provide students with working code that will compile and run. It’s easy to throw down the theory and some okay-ish pseudocode while in effect saying, “...and the rest is just a minor implementation detail, which I leave as an exercise to the reader...”
A while back, I played around with Earley parsing. The Earley parsing algorithm, the basic one, is pretty simple. In fact, it's almost recursive descent, if you didn't use recursion, but managed the current state of the parse manually.
Unfortunately, the original algorithm was described as dynamic programming, which was the hot new thing at the time. Then, there's a bug in it, it's hard to recover a parse tree from it, and it can be extended to be efficient on left-recursive (?) grammars.
Translating the algorithm out of the dynamic programming world isn't too hard. There's a reasonable description of fixing the bug. But recovering a parse tree (technically, a "Shared Packed Parse Forest", since Earley is context-free and to avoid exponential blow-up when producing multiple parse trees, the trees need to share structure) killed me. That paper's pseudocode was the worse spaghetti I've seen in a long time.
I think I finally managed to get the SPPF to get the right structure now, but it is way harder to penetrate compared to the complexity of the underlying idea.
Now I only have to figure out how to actually iterate over the different parse trees..
I don’t feel that’s really the case with TAOCP. It’s machine language for a fictional machine, but there exist emulators for the machine. The code is runnable.
I'm curious if anyone else experienced this or could point out where I went wrong. I really wanted to do the class and I liked the lectures, but i started it after finishing Gardner's Stanford algorithms Coursera class which had some of my favorite exercises ever. They had you write an algorithm in it's entirety, gave you sample data sets to check against, and let you write it in any language. Compared with that, Sedgwick's assignments were like coding with one hand tied behind my back.
If I may ask, what is your line of work? If it requires reading heavy algorithms stuff, it is probably an interesting job!
+ Assembly language level instruction execution is a good way of talking about the running time of algorithms at a finer level of detail than Big O notation. Finer grain than Big O is helpful when analyzing and optimizing programs.
+ MIX is a good abstraction of the Von Neumann architecture and that's where the analysis of actual programs occurs.
+ MIX programs are still accessible to a general auidence 50 years after they were written. No actual programming language available in 1967 is still similarly accessible to such a degree.
+ MMIX is the successor of MIX...but in so far as learning MIX is an issue, it's more complex child probably is not an improvement.
The art of MIX is that it keeps the reader's focus on the fact that computer programs ultimately load registers, read memory, execute mathematical primatives, and branch no matter what object-oriented, functional, or type safe abstractions a high level language might offer.
Knuth gives MIX programs only when it makes sense for a specific purpose (“…makes it possible to carry out meaningful studies of the effects of cache and RAM size and other hardware characteristics (memory speed, pipelining, multiple issue, lookaside buffers, the size of cache blocks, etc.) when comparing different schemes.”). He even gives C code sometimes, e.g. in Volume 2 when presenting a summary of the chapter on random numbers, there are five C snippets on pages 185–188.
Anyone who doesn't want to learn MIX can just ignore those few parts of TAOCP; there's enough value in those books without them. A good place to start may be the most recent fascicles[2].
On the other hand outside of OS and embedded systems people very few people regularly look at assembler, so for many people it's no longer an effective lingua franca for sample implementations. It's a bit like reading Principia Mathematica in the original Latin--sure, some people can do it but it's not the best way to transfer information to a broad audience. Personally I would be happy with some variant of C--the syntax is very broadly understood and the language is close enough to the hardware that implementation concepts come across clearly.
At the time, 'best practice' might have been Algol 60 or maybe Algol 68 because they were what it meant to be a standard language in those days even though implementations varied widely.
[1]: Whether C would have been a reasonable choice the time of the fourth (2011) volume is another matter.
Make that OS, embedded systems, and standard platform runtime libraries. The people implementing the algorithms in your programming-language-of-choice's stdlib certainly know what those algorithms are compiling down to. And they're the vast majority of the people who write out algorithm implementations at all, any more.
If I may ask, what do you work in that requires reading heavy algorithms stuff? (sounds like an interesting job!)
1) even the most abstract things are still a bunch of tiny operations; but with structure
2) I peeked at Sutherland Sketchpad Thesis, and it's mostly 1).. full of 60s machine assembly but, it's nicely structured, very regular. I had probably more pleasure reading this than all of the OOP book I know combined.
As other said, lots have CLRS as a reference bible to pick a solution or read about one subject or one idea in particular.
1. Breaking down a problem statement into a heuristic pattern,
2. Analyzing the best, average and worst case complexities of possible solutions,
3. Mapping the heuristic(s) to the data structures and algorithms with the most optimal complexity,
4. Implementing the code and having it successfully run, with runtime performance feedback.
I think the aforementioned competition websites are better than things like Project Euler because solving the problem requires running actual code instead of just giving a correct answer. That makes them much more interactive (and harder, in my opinion), because you might have to obey particular performance or complexity constraints, and you can receive feedback about how efficient your solution is.
I wouldn't say that practicing these problems will make you a better software developer, in the sense that you can develop maintainable software to solve business problems in a team setting with a large codebase, and I make no comment on whether these sorts of problems are optimal filters for tech interviews (that's a dead horse). But much like mathematics, programming (and specifically algorithm analysis) is not a spectator sport. You can't efficiently learn the material just by reading it, you have to do it, just as mathematical maturity comes about by solving many mathematical problems in different domains. To that particular end, I would say that competitive programming is about as perfect a formulation of practice as you can get for improving algorithmic problem-solving skills.
I know. Apart from it being fun, I'm mostly interested for the social-proof value in interviews. I don't want to beat the dead horse either, but it's definitely a socially useful skill.
Thanks for the suggestions.
Also, for practicing programers, studying data structures is more important than algorithms. I often refer to Sally Goldman's book A Practical Guide to Data Structures and Algorithms Using Java for this.
I agree that Sedgewick belongs in this comparison, but I can't fault the author for not having read it. I think it's the book I would most easily recommend to others; it's quite clear and has lots of very nice visualizations. I do think that Skiena deserves a special mention for explicitly being about how to come up with your own fundamental algorithms and data structures rather than just plug an existing one into your program.
Skiena on the other hand does a nice job of both describing the algorithm in a straightforward manner and also getting you into the algorithm headspace.
There's also this excellent free draft on analysis of common undergrad algorithms for parallelism http://www.parallel-algorithms-book.com/
TAOCP is more than an algorithms book, Knuth even have strategies for writing lengthy programs from scratch, how to build test libraries, how to optimize a program to make it cache memory friendly ect.
They give a brilliant presentation of Shor's factorization algorithm, and how quantum computers offer an amazingly natural way to do FFT's (a key aspect of the Shor algorithm).
I would not contest the OP's questioning of the choice of a chapter on QC in an undergrad algorithms textbook. I would, however, offer the standard "your mileage may vary" caveat to the OP's very negative characterization. I had no idea what QC was about, and this chapter provided me with a great "on-ramp" to understanding what QC is all about.
Have read only parts of Knuth / CLRS - They are good, but for real problems analysis , would prefer Sienna
Shame on me for admitting this! In fact, despite having personal CS favors like everyone, the painfully obvious subjectivity of the whole matter has always striked me as entirely futile to be taken into account for basically everything. I even worked for years of uninterrupted peace with people who would, for example, prototype pointer arguments of c/c++ functions gluing the wildcard to the type, then space then boom variable name. I've always been cool with this, even reading things like glibc's code.
This liberalism however has two exceptions: my own code obviously, a dictatorship where merciless enforcement of inflexible and rigorous coding style is accepted. But unfortunately, in algorithm books too! I know it is completely idiotic but I'd be in denial not to admit how much of a total turn off the coding style of code in algorithm books can be to me. Segdewick for example, such a wonderful book, the prose is excellent, the content really is outstanding (probably one of the most comprehensive I own), the editions I have even has a superb typeface and paper quality, unfortunately the C coding style of this book has this effect on me which makes me want to close the book immediately. I feel the same deep pain every time I have to look at it (which does happen a lot since it is a great book really!). I'm actually jealous of those more advanced human beings who are able to make an abstraction of that when reading a coding book!
However, the other books are probably a better match for a standard university algorithm course curriculum.