The way to distinguish this from the #5 situation in the article is to ask if you're dropping features because they're hard or because nobody uses them. The former is a red flag; the latter is a green flag. Before you embark on a rebuild, you should have solid data (ideally backed up by logs) about which features your users are using, which ones they care about, which ones are "nice to haves", which ones were very necessary to get to the stage you're at now but have lost their importance in the current business environment, and which ones were outright mistakes. And you should be able to identify at least half a dozen features in the last 3 categories that you can commit to cutting. Otherwise it's likely that the rewrite will contain all the complexity of the original system, but without the institutional knowledge built up on how to manage that complexity.
This is so important. I've been on many a project where, 3 months in, we wish we had historical tracking data on user activity to back up our instincts to cut a particular feature that seems worthless. The worst part? Even if you add it immediately, you'll have to wait 2-4 weeks to get a sufficient amount of data.
I think this was the problem a product like Heap [1] was designed to solve: just track all user actions, forever, and then assign pipelines after the fact based on what you want to check up on.
Don't work at Heap or anything, just love the team and product.
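To make the "track everything, decide what to measure later" idea concrete, here is a minimal sketch in Python. It assumes you already have a raw log of (user, feature) events; the event names and the `feature_usage_share` helper are made up for illustration and have nothing to do with Heap's actual API.

```python
from collections import defaultdict


def feature_usage_share(events):
    """Return {feature: fraction of distinct users who triggered it}.

    `events` is an iterable of (user_id, feature_name) pairs, e.g. parsed
    from whatever raw action log you have been keeping.
    """
    users_by_feature = defaultdict(set)
    all_users = set()
    for user_id, feature in events:
        users_by_feature[feature].add(user_id)
        all_users.add(user_id)
    if not all_users:
        return {}
    return {
        feature: len(users) / len(all_users)
        for feature, users in users_by_feature.items()
    }


# Hypothetical log entries, for illustration only.
events = [
    ("alice", "export_pdf"),
    ("alice", "search"),
    ("bob", "search"),
    ("carol", "search"),
    ("carol", "mail_merge"),
    ("dave", "search"),
]
shares = feature_usage_share(events)
```

The point is that the question ("what share of users ever touched export_pdf?") gets asked after the data was collected, which only works if the raw events were being stored all along.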
A good example is MS Office: there are a huge number of features that only 5% of users might ever use, but the majority of users are likely to use quite a few of these niche features individually, and if you remove all the low-use features, you piss off basically everyone.
I think the mistaken idea of an average user is why a lot of metrics-driven software seems to get more and more useless with every update.
(I can't see the present/away status of contacts in the newest Skype, really guys? )
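A rough sketch of the "no average user" point, in Python. The usage data and the `users_hit_by_cuts` helper are invented for illustration: every niche feature looks cuttable on its own, yet cutting all of them touches every user.

```python
def users_hit_by_cuts(usage, threshold):
    """Cut every feature used by less than `threshold` of users.

    `usage` maps user -> set of features that user relies on.
    Returns (set of cut features, fraction of users who lose something).
    """
    total = len(usage)
    if total == 0:
        return set(), 0.0
    counts = {}
    for features in usage.values():
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1
    cut = {f for f, c in counts.items() if c / total < threshold}
    affected = sum(1 for features in usage.values() if features & cut)
    return cut, affected / total


# Hypothetical data: everyone edits text, each person has one niche feature.
usage = {
    "alice": {"edit", "macros"},
    "bob": {"edit", "mail_merge"},
    "carol": {"edit", "pivot_tables"},
    "dave": {"edit", "equations"},
}
cut, affected_fraction = users_hit_by_cuts(usage, threshold=0.5)
```

So per-feature usage numbers alone are not enough; it is worth also checking how many distinct users a batch of cuts would hit.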
Ideally, you disable them in the old software, and observe how many people complain.
Too often, product management commits to cutting a feature, and then caves in when paying customers complain. It's best to know in advance which category a feature really falls in.
Sometimes you will want to fold feature changes into a rewrite (for example, removing a prompt that asks the user to confirm X twice). Sometimes this will ease development and be worth it, but other times it'll pay off to just retain the old functionality and add it to a list to be user tested later.
Once the tech is solidly switched over, take a swing at updating the poor UI. Do it in an agile way so you can back out of changes that the user base rejects, since (at least within my more modest usage studies) not everything people depend on comes up or gets reported. I'd much rather roll back a design feature branch than have users get change fatigue when you're forced to roll back your shiny new rebuild and the whole project ends up being shelved.
I think this is most important. A lot of people want to rewrite because they don't understand the current system and don't want to bother learning. Before you rewrite you really should understand the current state deeply.
If you can build that plan, and make the case that it will be easier to do the full rewrite, go for it. But if you couldn't put together the fix-in-place plan, you might not understand everything the old system does well enough to actually estimate the size of a rewrite...
(This isn't solely for full-parity rewrites: if you're dropping features, what does that look like dropping from the old system?)
A year into the process one of the c-level leaders pulled me into a room and asked why I couldn't fix the legacy code, and I basically told him that he should have pushed back on it. I couldn't fix the legacy code because that would be months of refactoring that should have been done instead of the rewrite.
Context: the legacy code had some design flaws that required major refactoring, but the legacy code "worked" except for very large deployments. The only problem was that the legacy system wasn't modular, so it didn't have unit tests and wasn't cross platform. All of those problems are easier to tackle via refactoring instead of a full rewrite.
Hmm... there have been a number of times when I've banged my head against the wall trying to figure out how to make my own code do something, until I finally bit the bullet and decided to rewrite the entire chunk from scratch and suddenly it took a fraction of the time I had spent trying to fix it to get it written and working. Not sure how to reconcile this with the advice you gave.
https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_...
You can't blindly listen to the experts.
#4 also mixes a good deal with #5 in that any changes you make (even purely good ones in your view) will require retraining of users and cause a kerfuffle when rolled out to your user base, people _hate_ change.
With all respect, that means you should not be in a position to rewrite legacy code, or to commit others to such a rewrite.
If all the experts you have worked with have been, in your eyes, overly attached to the old way of doing things, you have one of two issues:
- You have not had enough experience in the field, and have not worked with experts that actually have perspective about when/how to rewrite, abandon, or rework their code.
- You have dogmatically condemned people who think that the latest-and-greatest tech may not be a good solution to the problems at hand to the "old fogey" bin.
Either issue means you're not ready to make decisions at this level. Learn more. Research more. Watch more. Listen more.
Weirdly, gaining this perspective has less to do (in my experience) with years on the job, and more with diversity of team/business environments worked in.
You need that previous knowledge to know the "why" of things & if that why is still valid.
IMHO it's more dangerous if you're working with experts who don't want to improve the system.
I’ve always been a big believer in rebuilding your product from the ground up. I think it’s something you should always have going on in the background. Just a couple of devs whose job it is to try and rebuild your thing from scratch. Maybe you’ll never use the new version. But I think it’s a great way to better understand your product and make sure there are no dark corners that no one dares touch because they don’t understand what it does, how it does it, or why it does it the way it does.
And I’ve always believed that if you don’t want to rebuild your app from scratch, then don’t worry, a competitor will do it for you.
So I agree with every point raised in this article. And I think it does a great job of articulating the issues that often go unspoken. But I’d like to add one more. And for me, this is the biggest issue for any company wanting to rebuild its product.
If your sales team has more clout than your designers and developers, then you’re fucked. And in the enterprise software world, this is the norm. An unchecked sales team that gets whatever it wants has already killed your product and made it impossible to rebuild. Their demands are ad-hoc, nonsensical, and always urgent. So urgent that proper testing and documentation are not valid reasons to prevent a release. Their demands are driven by their sales targets, and the promises they make to clients are born out of ignorance of what your product does, and how it does it.
This is not true of all companies. Many companies find a reasonable balance between the insatiable demands of a sales force and the weary cautiousness of their engineers. But if your company submits to every wish and whim of your sales team, and you attempt to rebuild your product, then you’re screwed.
What's your learning process? If you don't do maintenance how do you know your rebuilds aren't creating the same problems that lead to the systems needing replacement?
I've got a very well-founded distrust of people that only work on green field projects; they're generally responsible for the systems that need rebuilding.
I don't appreciate the snark in your comment.
Incremental rebuilds are not sexy. Adding unit tests to legacy code (thereby making it not legacy code according to Michael Feathers) is not sexy. Sticking with the tried and true technology is not sexy. But they are typically the most successful approaches for those not compensated for changing things for change's sake.
Their time is much better spent working on improving the "legacy" codebase. Simple refactoring and splitting the codebase in a modular fashion mean you can work on limited parts of the system in isolation. This makes incremental improvements and switching to new tech much easier, and certainly less risky than a rewrite.
I mean, you can write a bunch of pinning tests, then try to prise out various bits and pieces, sure.
But what if all the stuff you're trying to prise out can now be accomplished with a few open source libraries that didn't exist way back, with a very simple rewrite of your business logic on the top?
That's a situation I've encountered quite a few times - a lot of legacy code that's largely boilerplate, with business logic drizzled over the lot, oozing into the little cracks.
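For anyone who hasn't seen pinning (characterization) tests, here is a minimal Python sketch. `legacy_shipping_cost` is a made-up stand-in for the untested legacy code, and the helper names are hypothetical; the idea is just to record what the old code does today, quirks included, and check any replacement against those recordings.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical legacy function with an undocumented quirk.
    cost = 5 + 2 * weight_kg
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost += 10  # surcharge nobody remembers adding
    return round(cost, 2)


def record_pins(func, cases):
    """Capture current behavior: {input tuple: observed output}."""
    return {case: func(*case) for case in cases}


def check_pins(func, pins):
    """Return (case, expected, actual) for every input where func deviates."""
    mismatches = []
    for case, expected in pins.items():
        actual = func(*case)
        if actual != expected:
            mismatches.append((case, expected, actual))
    return mismatches


def refactored_shipping_cost(weight_kg, express):
    # Candidate replacement: same behavior, clearer structure.
    base = 5 + 2 * weight_kg
    multiplier = 1.5 if express else 1.0
    surcharge = 10 if weight_kg > 20 else 0
    return round(base * multiplier + surcharge, 2)


def naive_rewrite(weight_kg, express):
    # A rewrite that "cleans up" by dropping the surcharge it didn't know about.
    return round((5 + 2 * weight_kg) * (1.5 if express else 1.0), 2)


cases = [(w, e) for w in (0, 1, 20, 21, 50) for e in (False, True)]
pins = record_pins(legacy_shipping_cost, cases)
refactor_mismatches = check_pins(refactored_shipping_cost, pins)
rewrite_mismatches = check_pins(naive_rewrite, pins)
```

The same pins work whether you prise pieces out of the old code or rewrite the business logic on top of newer libraries; in both cases they tell you which old behaviors you changed, intentionally or not.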
That may be good value for big established corporates, but for startups and smaller companies I don't think it is.
Well said. This is easily my #1 biggest pain point as a developer.
Hahaha. Just a couple of devs?
It’s just R&D. It’s not an exotic idea.
If your goal moves from feature comparable but on a modern platform, to new features, to a complete reinventing of the product all without actually shipping ... you might be in trouble.
I had a rebuild go 6 months over. In the heated executive meeting at t+3 months I was called to defend my team and pointed out that the VP Product had just delivered “final” specs literally the day before. How could we be on track with development if PM is 3 months past “end of development” with design specifications? The fact that the specs were changing weekly because “we’re agile” is a whole other issue.
The article touches on that too; simplified it's stating that if you're not live within 6 months, you're doing waterfall.
Waterfall isn’t just a synonym for “the wrong way to do it” :-)
So you rebuild as a new system as a gamble, because even though it shows all the traits described, the new system is at least one that anyone is willing to develop, and one where features can be added, and to which people can be recruited.
We know big rebuilds have small chances of success. But that doesn’t mean you shouldn’t do big rewrites. You are in a bad place if you even consider one. Maybe the big rewrite means the company has an 80% risk of going under. It could still be the safe bet.
As a developer you're constantly fighting managers who want to rush things to get them out and who will eventually blame you for a bug/non-defined behavior once you hit a certain milestone.
To me it seems the author of the article doesn't understand tech debt. If you've ever worked in a startup you'd know that the requirements are ever-changing, so a payment system that was put in place early might evolve to the point where you really need to refactor it, and in order to enable that refactor you have to refactor the whole business flow as well. If more than 2-3 features are affected by a new feature, a big refactor is definitely needed.
Only one solution is offered, which I don't think is adequate: why would I leave something in that was only meant to provide value in the short term, and then build on top of it until I kill the old system?
For example “we spend X/year on AWS but if we spend Y to rewrite in C++ we need fewer VMs and can cut that to Z/year” is a simple calculation. If your engineers can’t even do that, their motives are suspect.
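The X/Y/Z calculation above really is just a payback period. A minimal Python sketch, with made-up numbers (the function name and the dollar figures are illustrative, not from any real project):

```python
def payback_years(current_annual_cost, rewrite_cost, new_annual_cost):
    """Years until a one-off rewrite cost Y is repaid by the drop from X to Z.

    Returns None when the rewrite does not reduce the annual cost at all.
    """
    annual_savings = current_annual_cost - new_annual_cost
    if annual_savings <= 0:
        return None
    return rewrite_cost / annual_savings


# Hypothetical figures: X = 120k/year on AWS, Y = 150k to rewrite, Z = 70k/year after.
years = payback_years(120_000, 150_000, 70_000)
```

This ignores things like the cost of features not shipped during the rewrite, so treat it as a lower bound on the real payback time rather than the whole business case.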
The article mentions Martin Fowler’s strangler pattern (https://www.martinfowler.com/bliki/StranglerApplication.html): grow the new system in the same codebase until the old system is strangled. In my case (Ionic 1 to 2) however, both the entire framework and the language are different. How should the strangler pattern work in this case?
Identify key components and subsystems and rewrite them one by one. From the outside you seem to be switching over one REST endpoint after the other. Internally it's a bit more difficult, of course, but applications often have enough parts that are not SO intertwined that you can do stuff like this. It's related to how you break up a monolith: find bigger, less coupled parts, shave them off, and just touch the glue code.
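As a rough illustration of switching over one endpoint at a time, here is a small Python sketch of a routing facade. The `StranglerRouter` class, the paths, and the handlers are all hypothetical; in a real setup this role is often played by a reverse proxy or API gateway rather than application code.

```python
class StranglerRouter:
    """Send each request to the new system if its path has been migrated,
    otherwise fall through to the legacy system."""

    def __init__(self, legacy_handler):
        self._legacy_handler = legacy_handler
        self._migrated = {}  # path -> handler in the new system

    def migrate(self, path, new_handler):
        self._migrated[path] = new_handler

    def rollback(self, path):
        # Route the path back to legacy if the new handler misbehaves.
        self._migrated.pop(path, None)

    def handle(self, path, payload):
        handler = self._migrated.get(path)
        if handler is not None:
            return handler(payload)
        return self._legacy_handler(path, payload)


def legacy_app(path, payload):
    # Stand-in for the old monolith.
    return f"legacy:{path}:{payload}"


def new_orders_service(payload):
    # Stand-in for the first rewritten component.
    return f"new:orders:{payload}"


router = StranglerRouter(legacy_app)
before = router.handle("/orders", "42")
router.migrate("/orders", new_orders_service)
after = router.handle("/orders", "42")
untouched = router.handle("/invoices", "7")
router.rollback("/orders")
rolled_back = router.handle("/orders", "42")
```

Because the facade only cares about the request boundary, the old and new sides do not need to share a framework or even a language, which is what makes this approach usable when the codebases cannot be mixed.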
Sorry for the abstract reference here, but it applies to almost any replatforming out there. In most cases it is a very expensive operation for a business and needs some major reasons in order to justify such a move.
1: https://www.martinfowler.com/bliki/EventInterception.html
My firm belief is that when you need a rebuild, you are already well into a fail state as a company. Not to say there can be no recovery, but it is an indication of some deep problems for the company, beyond anything the engineering department alone can resolve... and if the rebuild is not coming from the executive leadership, it is an even bigger issue as it will more likely lead to bigger problems than it will solve.
I've become a member of a team the company scrambled to deal with a `legacy` python/SQL - based ingestion/storage system in an effort to 'harden' it. Despite my best efforts, we are going for a full rewrite into java/spring/avro/mongo/es. We have internal users talking SQL and utilising the system at the moment, a fair amount of data is relational.
I have run out of ideas how to convince the team and stakeholders, will have a one-shot chance to talk to VP. Any ideas how to voice the concerns about the full re-design (perhaps I'm just being difficult)?
2. Consider what 'the point' is in the first place, because the entire world could be run on python/SQL and it would be 'hard'. I don't think anyone would consider 'Mongo' to be 'hard'; usually people use it because it's fast and easy, not hard. Consider maybe only replacing one part at a time, i.e. Java-SQL.
3. Consider a simple clean up or refactor. No need to learn new languages and tools when maybe you just need a house clean.
4. People seem to be going back to SQL because of its inherent standardization - so many reporting and analysis systems use SQL as an interface, to the point where even NoSQLs are starting to use SQL.
In fact, I thought I was. We split our app into 3 parts, rebuilt part 1, then part 2, but part 1 couldn't be released to customers until part 2 was done, and we kept our legacy system supporting the majority of our users until we are done with part 3, which is nearing completion now.
I thought that was "replacing one piece at a time", but it isn't: most users aren't touching it until part 3 is done, and at that point they are experiencing a new system from scratch.
If users speak SQL, they will reject Mongo. The users of the system are the ones who will determine project success or failure.
Think about the data analysts, product owners, etc. who use the system. Interview them. Find out exactly how they use the system currently. Do they query in an ad hoc way? Do they rapidly iterate on their queries? Watch them interact with the system. If it's any way other than through dashboards that an engineer updates on request, you are in for rough seas.
Users must always determine the contours of a new system. There are big data solutions that speak SQL. Some are cloud-based, some are not. Some are faster than others. The team should be able to show you why they rejected those as solutions.
Using the normal sense of "rebuild" didn't make sense.
There are some legitimate cases where you really should be rebuilding.
You may not have seen such a case since they are rare, but they do exist.
A good rule of thumb is to try your absolute best to avoid a rebuild. If at the end of your hard work you still feel defeated and forced to go with the rebuild option, you probably should rebuild.
Sometimes a rebuild is just necessary, because you are on a tech stack that is no longer working for you, for whatever reason. How would you solve that kind of problem?
It could also function pretty much like a nosql db initially, to ease your transition, then you could migrate gradually to using it as a relational db. You need strong checks on data integrity before you start - you could consider double writing (to old orm using nosql + new orm using psql), and comparing data stored to be sure you don't miss anything at first, before you switch?
Here is a video with more detail: https://www.youtube.com/watch?v=dQw4w9WgXcQ
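The double-writing idea can be sketched in a few lines of Python. Plain dicts stand in for the old and new databases here, and the `DualWriter` class is purely illustrative; real stores would need their own read/write calls and probably a batch job for the comparison.

```python
class DualWriter:
    """Write every record to both stores; the old store stays the source of truth."""

    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.write_errors = []

    def write(self, key, record):
        self.old_store[key] = dict(record)
        try:
            self.new_store[key] = dict(record)
        except Exception as exc:
            # A failure in the new store must not break production writes.
            self.write_errors.append((key, repr(exc)))

    def mismatched_keys(self):
        """Keys whose record in the new store is missing or differs from the old one."""
        return sorted(
            key
            for key, record in self.old_store.items()
            if self.new_store.get(key) != record
        )


old_db, new_db = {}, {}
writer = DualWriter(old_db, new_db)
writer.write("user:1", {"name": "Ada", "plan": "pro"})
writer.write("user:2", {"name": "Bob", "plan": "free"})
clean = writer.mismatched_keys()

# Simulate drift in the new store to show what the comparison catches.
new_db["user:2"]["plan"] = "pro"
drifted = writer.mismatched_keys()
```

You would only flip reads over to the new store once the mismatch list has stayed empty for long enough to trust it.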
I would start by firing people that led to this situation.
You are one of those blessed people who can architect a system and the architecture holds up for decades. From my experience most systems will end up in a big mess over time if features get added. There is almost no way around it.
This is exactly why maintenance is needed. Proper maintenance that includes things like updating the architecture and gradually migrating the whole system to that architecture, rebuilding small unwieldy components, updating and migrating database schemas as the product evolves, removing unused features.
If a product is just getting bugs patched and nothing else then it isn't really being maintained, it's being deprecated. Unfortunately as an industry we still think that there are distinct build and maintenance phases and that the latter can be done with fewer resources.
Thereby fomenting Red Flag #4, not "working with people who were experts in the old system."
For example an executive/management team that over-commits the organisation and creates a culture of rewarding technical debt and punishing maintainers.
Rather than fixing these issues they will continually search for a superhero employee who is going to come in on a white horse on Monday and fix it all up in two weeks.