The Top 5 reasons to use TiDB
Where I work, it's up to the product owners to select which database technology they would like to use. The default choice is Aurora (MySQL 5.7), but for large databases we are starting to offer TiDB.
But being able to create larger databases is only one benefit to TiDB. I've also found that my answer on what the benefits are substantially differs from the key features documentation. So today, I will share with you my top 5!
1. Horizontal Scalability with automatic hotspot balancing
I think it is pretty well understood that TiDB scales horizontally. What is less well understood, is that it does a great job at balancing load across the cluster. This helps to reduce the hotspot problem that usually occurs in sharded systems.
I have previously written about this in more detail in my post: This is the most underrated feature in TiDB.
2. Completely transparent sharding without any limits on cross-shard transactions
Proxy-based or application managed sharding solutions usually have strong limitations on cross-shard transactions:
- For read-queries they are often limited by not having database statistics, so they can only effectively plan simple queries.
- For write-queries they typically try to avoid cross-shard transactions, since it requires a 2-phase commit to ensure consistency (and 2PC is not well supported by MySQL).
With TiDB there really aren't any limitations on either reads or writes across shards. That's not to say there won't be performance optimizations that can be added later, but from a functional perspective it's completely transparent.
This makes TiDB a great retrofit for existing applications since very little code needs to be changed.
3. Foreign Key Constraints (coming soon!)
I am calling this one a little bit early, but there is a draft pull request to add Foreign Key constraints to TiDB. This was my #1 feature request from an earlier blog post, and I am very excited to see it being worked on.
Having foreign keys ties nicely with the previous point about having completely transparent sharding: you can have a single global view of the data, and set constraints on it.
MySQL has foreign keys too, but we are unable to use them because they are not supported by online schema change tools. This relates to the next point, where the built-in DDL in TiDB is actually quite good.
4. A simplified DDL story
For MySQL (and Aurora) we need to use online-schema-change tools such as
gh-ost. Gh-ost works slightly better since it is triggerless, but it requires the binary log enabled which the Aurora engineers really try to encourage you not to do.
Some DDL changes in MySQL are now instant in 8.0, such as adding or removing a column, dropping an index, or renaming a column. But some of the really important ones, such as change data type are still blocking. There is also a category of schema changes which are non-instant, but not blocking either (aka online). I've found that the online changes are actually not online for the last phase of the schema change, which makes them actually quite risky in practice.
In TiDB, most schema changes are instant. The ones that are not instant but online are not risky, and appear to be safe to run in production. That doesn't mean that I don't have a few gripes with TiDB DDL. My top 3 issues are:
- Only a single node in the cluster (the DDL owner) will ever be used to perform DDL.
- Only a single DDL operation can be performed at a time (concurrent DDL is experimental).
- The DDL process is too slow (a new fast import is experimental in v6.2)
On top of this, not all DDL changes are supported. For example there is no way to add/drop a primary key, or add partitioning to an existing table.
But overall it's still a welcome change to ditch the schema change tools and use the native DDL supported by the database. And as I said above: FOREIGN KEYS!
5. Easier upgrades
In the MySQL world, once upon a time we were used to receiving a new GA release every 2-3 years and then having it supported for the next 8 years. Those times have changed, with MySQL 8.0 now using a rapid 3 month development cycle.
Every quarter there are new features, and unfortunately a potential for new regression bugs. Several times xtrabackup has broken on 8.0.x releases, and since the point releases also contain vital security fixes it leaves database admins in an awkward position.
With Aurora 8.0 committing to patch release upgrades, we don't get out of this upgrade-pain just by consuming our database from Amazon. Once we get to 8.0 (we are still on 5.7) we will have to expect to spend significantly more time on upgrades, which I am not excited about.
With TiDB the upgrade cycle is still a little annoying: there is a new LTS version every 6 months. But the real advantage is that upgrades are safer, since it's usually possible to use a canary to upgrade one tikv or tidb node before rolling out to the rest of the cluster. This is safer than Aurora, where (since we don't use read-replicas) it's not really easy to test with a small percentage of traffic before rolling out completely.
One more nit about MySQL 8.0: minor version downgrade is not supported. This is not technically supported by TiDB either, but since the requirements for a downgrade are the same as a rolling upgrade (i.e. you need 2 versions to run at once), it is tested and usually works.
 It only just switched to every 6 months, previously it was every 2-3 months. I would prefer annual. I can upgrade to every second LTS release, but skipping releases makes each upgrade a little more risky.
So there you have it: my top 5 reasons to use TiDB. If you haven't seen it yet, please also check out my top 10 feature requests for TiDB. I am also happy to report that my top 3 requests are now in progress, but that's a subject for another post!