Database Schema as Code
Instead of writing up and down migrations, you define what the end state should look like. Then the computer will figure out how to get here. This is just how the industry started managing server configurations (Puppet) and infrastructure (Terraform).
We use protocol buffers so it was pretty straight forward to have a definition of what our tables should look like. We have a script that figures out what the delta is between two states (either proto files or a db) and can calculate the schema migration SQL (e.g. CREATE TABLE, etc).
From there, we run it through a safety check. Any unsafe migration (either for data loss or performance issues e.g. DROP TABLE) requires an extra approval file.
There's no real difference between an up migration and a down migration (except that one tends to result in an unsafe migrations). It's calculable at CI time so we can give devs a chance to look at what it's going to do and approve any unsafe migrations. API compatability checks enforce that you need to deprecate before you can drop.
DML, that is data changes, are handled via standard check in a sql file and CI will run it before the code deploy and after the schema migration.
Alembic is the one other place I've seen this concept (a couple others mentioned this) so it's not new, but surprised I haven't seen it more places.
[0] Shameless plug: We're hiring if you're interested in changing how healthcare is paid for, delivered, and experienced. https://www.devoted.com/about/tech-jobs/
I'm developing a suite of tools [1] to provide declarative schema management and "schema change by pull request" functionality, initially targeting MySQL and MariaDB. A few large companies have built pipelines using one of my tools -- including Twilio SendGrid, who wrote about their process in-depth recently [2].
[2] https://sendgrid.com/blog/schema-management-with-skeema/
Git would never have worked it required devs to write the up/down patches - why should we have to write the up/down migrations for my schema?
Excited to see more tooling around declarative schema!
This is a (minor) pain point for traditional migration systems.
The concept appeals to me, but it only seems to work for trivial migrations.
I think there’s a whole set of problems to be solved in this space and frankly, I’m as surprised as anyone that given how SQL is declarative, we use procedural code to do migrations so part of my post was hoping people would tell me what tool I should be using or how this approach fails over time. So your examples are really helpful for me as I think through if it’s possible to do automatically, workaround, or get by without.
It seems to me that we just lack the ability to express these transitions mathematically that can help us do them. And of those, there’s probably only a subset which are possible to do without taking downtime.
In particular, the class of migrations that you outlines are a combination of DDL and DML changes and also have quite a bit of code complexity to do without downtime. It’s definitely a current weakness.
For simpler cases, where it may make sense to run a data migration immediately after a schema change, a good generic middle-ground may be configurable hook scripts. A declarative schema management system can then pass relevant info to the hook (which tables were changed, for example) and then the script can run any arbitrary row data diff/apply/migrate type of logic.
I do understand your point though; for relatively straightforward data migrations, an imperative system can capture these much more cleanly by just coupling them with the corresponding schema migration code.
Django allows you to define your models (schema) and then run a command that will generate the migrations. If you don't like the migration that was generated, you can modify it. You can customize up and down operations.
There are also tools that will take an existing database and generate the Django models for you.
All of these operations can also export the exact SQL that will run on each supported RDBMS platform in case you want to be extra sure on what exactly will be executed.
A system like Saltstack or Puppet for databases would not have checked in migrations, these would be generated on the fly at deploy time.
So you could very well have multiple state changes in a single run, by comparing actual DB state and desired DB state, then creating the SQL code as needed for that deployment.
Honestly not having to fiddle with the migrations table on a live server seems pretty nice ;-)
This could very well turn out to be Django's next gen migration tool...
* What happens when your company creates new systems that aren't Python/Django? You can either still shoehorn all migrations into Django models, or have multiple separate schema change processes/pipelines... both options are not good.
* If someone makes an out-of-band schema change manually (either by accident or to do a rapid hotfix), you're no longer on a known version. Not sure about Django, but when this happens, most traditional migration systems cannot operate without additional manual cleanups... whereas declarative tools can inherently transition any state to any other state.
* Depending on the DBMS, with large tables and certain types of ALTERs, using an external online schema change tool is greatly preferable to just running an ALTER directly.
* Does Django support sharding / the notion that a change to a sharded model must be made to multiple shards?
Here's the scenario that I think you're laying out: 1. Commit A creates column foo 2. Commit B has DML that reference column foo 3. Commit C removes column foo
This works fine if our CI deployer does each commit individually. First roll out any schema changes, then run any DML SQL.
However, our deployer might pick up all those changes and since we roll out the schema migrations first (in this case a create + drop -> NOP) and then runs the DML (which will error), this is an issue because of the rollup.
In practice, we have yet to see this case (most of the time, the dev who write the DML is close enough to the code to know if it's going to be dropped soon and we don't drop that many columns - in part because we know that there be dragons) but truthfully, I haven't thought about it much and need to think through what the impact is beyond this example. Thanks for helping me refine my thinking and I'll have something to ponder on this weekend!
It requires a bit more finesse and integration into our code base as it requires multiple deploys with code that knows how to handle both columns.
I have been advocating for this approach for a while now: https://djrobstep.com/talks/your-migrations-are-bad-and-you-...
Spanner (https://cloud.google.com/spanner/) I think can auto-compute the diff between its current state and a given schema, generate appropriate SQL statements to perform a migration and get user confirmation for destructive schema changes.
Yes, and CFEngine pioneered this in 1993 (Mark Burgess).
I make my living as a CFEngine consultant.
"oh no! our migration deleted columns without re-representing that data in the new manner, and our users have already done changes to the database so we can't simply restore from a backup!"
quick!
dbgit checkout -b fixed-migration before-original-migration
# *run fixed migration*
dbgit rebase --onto fixed-migration after-original-migration master
day saved!If only it were so simple.
With good updatable view support, you can maintain versions of table views and constraints on the views that may be forward, reverse or laterally migrated. (You'd have a different set of views for each version of the schema in use, so each build of the client is connecting to exactly the schema it expects.)
Then you'll want to checkpoint the base tables. So the system must track usage of view-set versions to determine when they can be dropped entirely. Since the constraints are enforced at the view level, you have to deal with legacy data that violates current constraints, but was hidden by the views.
Then the system can do a checkpoint and delete obsolete columns and tables, moving forward the earliest version you can roll back to.
Of course, that all requires good updateable view support, and I don't think any products offer that.
It would be cool to not have to be so careful when committing migrations, needing to be absolutely sure that we're not screwing something up. This is similar to how one would be careful of changes done in source code before we learned how to use version control systems like git.
I can go completely wild with git, deleting random files, overwriting others with random junk, sharing them to my coworkers, etc. It would only take a few minutes to fix that.
See https://dev.mysql.com/doc/refman/5.7/en/cannot-roll-back.htm... for an example.
We commit the migrations along-side the application code and in our case we use FlywayDB [1]. The only down-side is that this tool doesn't perform roll-back operations automatically. You can always do them by writing another migration that goes forward to the past.
Another popular DB migration tool is Liquibase [2]. I don't have much experience with this tool as it doesn't fit our build pipe-line as well but it does support and encourage defining a roll-back for each migration.
[0] https://www.martinfowler.com/articles/evodb.html
EDIT: HN is the new StackOverflow? I think this is a really important question for development teams and yet I could see it being closed due to the "Questions asking for tool or library recommendations are off-topic for Stack Overflow ..." rule. Sad!
I've done that too, and agree re: rollbacks... I've also never been a huge fan of the fact that the final, in-production server isn't ever really reified as a source file. What you have instead is the initial state and a bunch of migration files. It works, and it's the most reflective way of how the final schema got to that state, but it's not the final state. I guess that's the trade off of the relatively unsophisticated approach taken by tools like Flyway.
It tracks changes that have been applied in changelog tables so you don't apply them multiple times.
Common operations such as adding a column are defined in a supported markup language but for more complicated things, such as migrating data, you can reference ad-hoc SQL files. All of which can be checked-in to your codebase.
The only real downside is because you are applying the entire development history of the database, you can sometimes be caught doing illogical things like adding and then removing something later on. This can be mitigated by rewriting history if needed though.
Flyway was not as good for me because it lacked a domain language, raw SQL makes it less easy to interpret but I know developers who preferred that.
Could you clarify what you mean by this?
I was doing this method back in 2006, let me assure you that the automated tools available today make this look like amateur hour.
Automatic schema transactions, automatic up/down scripts, automatic detection that a schema change hasn't been committed. Automatic change tracking table in the db. Ability to seed a database with test data.
With a single command I can go back to any arbitrary point of changes, switch branch, work on a bug fix, switch back to dev and the next time I fire up the Dev site it'll automatically redeploy the new changes.
Just streets ahead of hand coded scripts. Yes, there's a learning curve, yes you have to pay attention to your design still, but it is so good.
I was skeptical at first, but honestly I'm never going back and I don't miss it. I'm using Entity Framework Migrations, I assume other languages have similar/better tools.
We're a C# shop. Our DB migration scripts are simply named with a datestamp and a change description. The scripts are added to a console exe project as embedded resources and everything is committed to source control. These exe projects are very thin and contain a single line of C# code that passes the command line arguments directly into an in-house DB deployment library which is installed via nuget. This library handles command line parsing and executing dbup in various ways, using the scripts embedded in the calling assembly.
The result is a simple executable file that, depending on its command line, can upgrade the DB directly, preview which scripts would run against a particular DB instance, test the DB deployment against a clean SQL Server LocalDB instance, or generate SQL scripts for DBAs.
One nice feature is that the exe can also return the connection string of the SQL Server LocalDB test instance to external code as a test fixture. We can use this to directly unit test our repository types against a real, clean database. When the test fixture is disposed by XUnit, the temporary DB is cleaned up and removed.
The console projects are built and tested as part of our CI builds, then pushed to Octopus Deploy[2] as packages. The Octopus deployment process simply runs the executable and passes it the connection string of the DB to update.
[1] https://dbup.readthedocs.io/en/latest/philosophy-behind-dbup...
Aren't you missing some steps? Or does your db back up schema structure and data separately?
1. Make sure nobody can insert/update/delete anything
2. Take a backup
3. Run your up migration
4. Thoroughly test everything
5. Allow insert/update/delete againI use https://fluentmigrator.github.io/ and also used dbup..
I generally like the up/down better, and both the up and down is remarkably trivial to write generally, the downs are useful during development you might change your mind about the DB structure. Never used a down in production.
To reduce the chance of error we don’t destroy columns or tables.
Our application then has an update step which runs on startup for any data migrations (or new data additions), and then updates a version number stored in the DB. The data migrations are super rare.
This all means we can migrate from any past version to our latest one: because we have all previous schemas stored in git and n update functions in our app that can walk the versions
It’s worked reliably for over a decade and is pretty much entirely pain free
[1] https://www.red-gate.com/products/sql-development/sql-compar...
Migrate in a backwards compatible manber, so that version N of the app works with N+1 schema (eg add a column, but don't destroy existing ones, use triggers to keep data aligned). When all nodes for an app are are at N+1, you can make a new version with destructive changes (that would break N but not N+1). There's a Fowler article about this.
Background:
* Almost everything is self-service by necessity. Except for some high-blast-radius cases, dev teams are able to manage their own schemas without needing MySQL team intervention. This is made possible by having automation that has appropriate safeties built in.
* There's a repository (git, hg, whatever) storing schemas. It has a couple levels of subdirectories to organize different database tiers and individual databases. In each of the bottom-level subdirs, there are text files containing CREATE TABLE statements, one file per table. In other words, this is a declarative repo, modeling the ideal state of tables in each database.
Process to add or change a table:
1. Just add or change a CREATE TABLE statement, and commit in SCM.
2. Submit a diff (pull request). Someone on your team reviews it, same as a code review.
3. Once merged, the schema change can be kicked off. (A few years ago, a dev would need to run a simple CLI command to tell the automation "please begin working on this table", but I believe this has been automated away since then.)
The tooling automatically manages running the correct DDL safely, on the correct machine(s), even in the case of a large sharded table. Devs never need to write ALTER TABLE statements; everything is just based on CREATE TABLE.
There was a separate flow (with extra steps, on purpose) for destructive actions like dropping tables or columns.
My current company uses mongodb, so migrations aren't a thing. It's pretty nice, TBH.
So, when changing columns do the type change and name change in separate migrations and you won't have to do it by hand.
Just out of curiosity, has your team used Alembic's branching system with any success?
By default EF will throw an error if the model (in code) and the database are out of sync. There is a setting you can set to tell the ef not to care about the version, but then you have to take the responsibility of making sure the old code will run against the new db version.
Most of our stuff is adding a new column or table so the old code doesn't rely on it and everything goes fine anyway.
But the development experience, especially at the start of the project (if you don't maintain your seeding code), is awesome. You just point the project at an empty or non-existing database, the schema is automatically created, and any necessary base data can automatically be seeded.
A lot of the implementations look like this:
create a migrations directory; add an initial migration script in it; make a migrate command to execute before service starts but after the backup.
The migrate command recipe: create a migrations table in the db if it doesn't exist, otherwise fetch the list of migrations that have been applied inside this database; then, apply the initial migration script if it's name is not found in the said migrations table, and insert its name in the migrations table so that it will not be executed again in this database by the migrate command.
Flyway is okay in my experience. Can't complain about it, but I can't praise it either, it just does what you'd expect.
For example, if you accidentally put a 'default' on a column when you add it to postgres, it will lock the entire table while it rewrites every row, inserting that default value.
Another common postgres blunder is creating indexes on big tables without using 'concurrently'. This also locks the table for writing and you'll have a bad time.
It's one of the nice things about Postgres, they're always improving. A few years ago, concurrent index creation wasn't a thing either. Nor were "CREATE/DROP IF (NOT) EXIST" statements for various bits and pieces, but they just keep adding to it over time.
In the upcoming PG12 there is REINDEX CONCURRENTLY, so we'll finally be able to fix corrupt/bloated indexes on the fly without having to drop and recreate them.
$ npm i migrate
If anything, I'm a lot less likely to use something written in JS when correctness is absolutely paramount.
We have a tenanted architecture, so we need to run the same migrations on lots of different databases of wildly different sizes and data complexity. We test migrations by restoring a snapshot of a sample of customers and running the migration through, and some smoke tests before and after that can be compared.
More recently, migrations are becoming a release bottleneck. That is, they take too long to run within a downtime window (fintech, so Sunday afternoon is when releases go out). We're looking at building tooling around Percona OSC, and using triggers to maintain data invariants for data migrations until code is able to catch up.
Migrations aren't what really slows us down though. Instead, it's data volume, and the difficulty in writing efficient queries while also providing a flexible UI experience.
I think that numbered sequential migration steps are still the correct way to go. They should be written at the time the feature/change is developed, ideally by the person doing that development.
Whether they are in SQL or some sort of ORM thing probably doesn't matter that much.
I don't know how to handle branching/merging issues with this approach beyond "pay attention, be careful, talk to your colleagues about database changes you're making".
At my first company we stored the whole schema instead of the incremental steps to get there. We used SQL Delta to synchronize the real database to what we wanted.
SQL Delta is a good tool, but one time we ran into a case it couldn't handle during a big deployment at a large hospital and had to abort and roll everything back. Very embarrassing. We switched to incremental scripts after that.
SQL Delta is still a good tool if you have lots of views, user defined functions, or stored procedures. As long as they aren't needed for the migration itself, you can just synchronize all of those at the end.
It's also a great tool for actually inspecting if the result of your migration is what you expected.
I do like the idea of having easy access to the full schema text as well as the migrations (someone mentioned that Rails does this). Ideally I think this should be generated from the migrations and not checked into version control.
All hooked up into CI, as well as being available from the CLI for developers, so they can open a PR and get feedback, as well as seeing what locks are required for a given migration.
It’s always one of the first things I miss when I have to work on a non-rails codebase.
We use a staging environment, so CD deployed migrations are always run at least once before running on production.
Migrations have to be written so that currently deployed code will work after the migration. Eg, to rename a column, add+copy, deploy, then in a second migration drop the column.
This gives you full control over your migrations, while still being incredibly simple - it's just SQL!
For views, sprocs and functions, we don't use sequential migration scripts, instead opting for idempotent creation. This also means it's easy to see the evolution of views etc when looking in source control.
In the past I've used Entity Framework and Entity Framework Core migrations, and hated them both - I'll never be a fan of codegen, but aside from that they sometimes generated wrong code that had to manually adjusted, and you also quickly end up with hundreds of scripts.
I like DbUp very much.
Besides that, I tend to follow the same general principals RainforestHQ presented in 2014 https://www.rainforestqa.com/blog/2014-06-27-zero-downtime-d...
After submitting a number of bug fix and feature patches to the upstream projects I ended up writing a (IMO better) tool to apply the migration scripts (https://bitbucket.org/koalephant/mallard). We currently still rely on SchemaSync for that part, but it’s been more reliable and it’s ultimately a tool for developers who review the generated sql anyway - the tool to apply them needs to run automated on remote environments.
Our source controlled files contain the database definition:
- a dacpac file generated by SSDT for the schema
- sql scripts for seed data (which changes a lot in our case), generated by a custom CLI tool (uses bcp.exe).
We have a "dev" database that is basically the "master". We have a CLI tool to generate the database definition files from the dev database. When someone make/need changes in the database, he makes them in the dev DB then call the CLI tool to update the source files.
When publishing a version the dacpac and data script are included. Migrations are created on demand (no need to go through each version, skipping versions is common). Our migration tool create reference databases of the source and target versions, then generate a schema diff script with SSDT and a data diff script with ApexSQL Data Compare. We can review/editthe migrations scripts before applying migrations.
It works well enough that we run automated migrations at night (with backup before/restore after in case of issue).
And I hate to tell you but despite what you’ll read on HN every single company I’ve seen and every single vendor I’ve dealt with - is using scripts.
Just simple scripts, in source control, amended by hand, lord hopes applied as part of a CI system, often without any kind of version numbering (so once it’s in the database you don’t know where it came from) and with no rollback.
Declarative migrations do exist. If it’s in .NET then it’s invisible to me, but I imagine it’s pretty rare. Using SSDT (Microsoft) for tooling is possible but has so many edge cases and is poorly documented so coming up with an architecture requires the kind of expert where there may literally only be a dozen in the world, so it isn’t done outside of demos that fail and get thrown out the second you’d ever try to implement them.
Turns out Visual Studio has a 'diff' generator for both schema and data. Holy hell that worked the treat.
What slows us down and related to this topic is cross-database migrations. We have a bunch of microservices in different codebases. Sometimes we need to adjust bound contexts which move the responsibilities from services to services, we found it's much harder to migrate between different databases behind those services. It's either hard to keep track in the same repo and/or too slow to migrate large volume of data across hosts. In the end we wrote some optimized SQL to do that. Maybe monorepo could help but this doesn't happen too much, and we haven't start trying it yet.
- flyway-like go based homegrown process for Cassandra (that includes support for executing go scripts for data migrations etc)
Above work pretty well - for many, many microservices and continuous deployments all the way to prod.
We're going to research DACPACs to deploy databases. We hope it will be better.
Schema changes are relatively straightforward, there are plenty of tools that can help. Changing the database backend is a different story altogether. There are so many unknown unknowns when changing databases, that it's generally best to avoid it if you can.
It does NOT integrate with your server framework so you need to figure an ORM solution out, independent of Sqitch.
This place we do it manually but because of how the app is architected migrations are rare. If we migrate we try to make the code compatible with before and after schemes.
$ python manage migrate
SQL migration management with in-database version tracking.
Super simple, but works for us.
You basically write SQL queries for up and down.
I have a system called timequerylog and a tool tql-cli where data is just appended to JSONL files organized by key, date, and time.
----
In one project, we use DBGhost (for MS SQL Server). It's like RedGate SQL Compare, but in deployable, self-contained executable form. It does a complete schema sync, so internally we run on every build, and externally can upgrade from every previous release, without the pesky "in between" evolution you tend to get with ordered migration scripts. It's run as part of our upgrade process for every app version, and our deploy package is built from the 'initial database create' SQL script in source control.
To make a schema change such as adding a new column, you modify the relevant `CREATE TABLE` script, commit it to source, and that's it.
We also use a hand-built pre- and post-deploy script to do anything tricky that can't be automated (renaming a column, copying/converting data from one column to another, etc). Importantly, these scripts are idempotent (eg: `if (old_column_exists) { start transaction; create new column; copy+transform data; drop old_column; commit transaction; }`). We generally avoid major changes like this as much as we can, but it's possible when necessary. We also have DBGhost configured not to drop anything (to avoid mistakes, or dropping customer customization that they do even though our support contract explicitly says not to), and instead write those by hand.
This process has been in many dozens of customer-facing releases -- including on-premise customers -- for several years, and 'just works'. DBGhost is a clunky app, but our interaction with it is now entirely scripted (via either build or deployment scripts), and the compare engine part of it is actually very good.
----
In another pretty simple app (that runs on our cloud infrastructure only, with 'production' deployments mirrored in a couple different regions, and usually a single 'dev' deployment though sometimes there are other temporary ones) we opted to only use an idempotent, hand-written script, executed as part of deployment. It has `create table if not exists` statements to setup from scratch, and it also has the transforms for changes done over time (`create index if not exists;` `drop column if exists` etc). We periodically remove the transform statements to clean it up after all deployments have been updated past that point.
Even though it's manual, it's actually quite easy to maintain, so long as you're careful about ensuring it's idempotent. The nice part is it typically gets deployed to the dev infrastructure multiple times during the course of working on something, so if there's a problem it becomes obvious very quickly when the deployment fails.
----
There's also another app which uses the more traditional ordered migrations files, which I liked at first but over time I find it annoying. Deploying from scratch installs a really old initial database schema, then proceeds to apply dozens of modifications, many of which overwrite previous modifications.
----
I've also worked on an ordered migrations file app where there was a separate 'create' step for the initial creation (instead of running through each migration). The first time I deployed from scratch I found a bunch of problems where the create script wasn't 100% synchronized with the migrations, but also was synchronized enough that it wasn't possible to 'fix' by running each migration (one of the early ones failed). The only fix was to go through by hand and manually run the relevant migration files. I'm sure there can be better practices to help prevent this (eg some compare step on build), but this happened years ago and still sticks with me as a terrible way to do migrations.