- The ability to control a running migration is crucial. We have pretty predictable load, and we generally run long-running migrations during off-peak hours. If a migration runs longer than we were expecting and might run into peak hours, we can pause the migration and have the migration not impact users.
- hooks make it trivial to integrate with other tools. Right now it reports to slack, but if we used it more, we'd likely hook it up to real monitoring infrastructure.
- there's a lot of default behavior that we want. I'd recommend regular users wrap their best practices in another script and not call gh-ost directly. It's nice to not worry about good defaults for e.g. throttling, or worrying about whether ghost is hooked up to some kind of external monitoring.
The first point depends on the mechanism used to keep up with changes to the original table. You can’t fully pause migrations on pt-online-schema-change for example, as it leverages triggers for that part.
From my phone so sorry if too brief, gh-ost’s docs are great and would tell the whole story.
I understand that at GitHub's scale, foreign keys might be more of a hassle than what they are worth, but for a smallish company that values data integrity over scale and uptime, this is not an acceptable choice.
It is true that it is not on our roadmap to implement FK support for gh-ost (see https://github.com/github/gh-ost/issues/331), but if anyone wishes to contribute support for FK we're grateful. We've had more complex contributions coming from the community and we're grateful for those.
FK checks also affect performance, of course. Where I work, we disable FKs on our bulk inserts but keep them enabled otherwise and also in tests; but our workload is different from the usual consumer web app, we have multi-million row inserts per user, and no more than 100 users or so per customer, who each get their own tenant DB.
The usual routes for data consistency between tables are batch clean up or in-app validation.
At a past job where we had a complex MySQL setup, I set up a slack autoresponse to post "Just say no!" anytime someone mentioned foreign keys. :-)
They're also a performance impact on large tables since inserts/deletes must make multiple trips to the tables/indexes. That's a growing operational hassle as tables grow larger.
It was recently open-sourced. Do take a look.
For the reader here: one thing to clarify here is that gh-ost performs schema migration via a data migration between two different tables and it does it via a very efficient way. Ghostferry on the other hand is general purpose data migration library that moves data between different databases, most likely different hosts. Frequently, both schema migration and data migrations are abbreviated as migrations and thus may cause some confusion. The domain of operation of Ghostferry do not necessarily overlap with gh-ost, as it would be very inefficient to use Ghostferry to implement gh-ost.
That said it is a very interesting project on its own as it has a lot of potential use cases. I don't want to hijack the thread any further than I already have so if anyone has any further questions, you can contact information and docs in the repo.
Sounds like this tool is doing something similar but avoiding the use of triggers for flexibility.
We dual wrote to both DBs while we copied the existing data to the new DB, then switched them over. I think we had less than 5 minutes of downtime all up.
Flynn.io uses the same kind of strategy; transaction log && async replication (https://flynn.io/docs/databases)
A little sad nanobox.io which one of my app running on has an inferior strategy; temporarily offline at the last sync moment (https://docs.nanobox.io/data-management/data-migrations-scal...)
For this reason alone I think we'll continue to use schema-change tools on MySQL even if the server itself becomes better at those.
In the specific case of gh-ost, another good point is that migrations can be completely paused, which in MySQL is not true of online DDL.
(And as modern MySQL releases get better online DDL support, become less and less critical - though still useful for all of those edge cases where native lockless online DDLs can't work yet)
Its pretty cool. Check it out as well.
https://www.percona.com/doc/percona-toolkit/LATEST/pt-online...
Also the linked post[3] in the readme hit us very close to home. We originally tried some of our migrations with pt-online-schema-change, which was great in theory but caused a lot of locking contention during the actual process.
I see many people hammering on the lack of foreign key support which is interesting to me. At some point, a database system grows to where relying on MySQL's Online DDL[4] "works" but not really with production load. I feel like a team knows when they need to bring in a tool like this.
The dev in me understands how wonderful FKs are for consistency. But the db-guy in me that has had to deal with locking issues recognizes FKs as a tradeoff, not dogma.
If you shy away from migrating your large or busy tables, or are scheduling frequent maintenance down times in order to migrate these tables, that's when gh-ost (and others) are appropriate to evaluate.
So for us it's not an immediate red flag that gh-ost doesn't support FKs. We just have to work around that limitation[5] because the alternatives are much worse.
For the record, we don't gh-ost all of our migrations. Only the ones that are deemed sufficiently large enough are gh-osted and those heuristics will change from team-to-team.
But as a guy who has had to deal with our database issues AND as a developer who doesn't want to be chained by a database design decision from a decade ago, I love the flexibility gh-ost gives us as we continue to grow.
[1] https://www.getharvest.com/
[2] https://dev.mysql.com/doc/refman/5.6/en/replication-features...
[3] https://dev.mysql.com/doc/refman/5.6/en/replication-features...
[4] https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-...
[5] https://github.com/github/gh-ost/issues/507#issuecomment-338...