Experimental features in TiDB
In my previous post about my top 10 feature requests, I said that there are too many experimental features in TiDB, but that is the subject for another blog post. Well, here we are today.
Let’s start with the TiDB documentation on experimental features, which states:
This document introduces the experimental features of TiDB in different versions. It is NOT recommended to use these features in the production environment.
If only that recommendation were actually viable. For example, under SQL it states that User Defined Variables have been experimental since TiDB 2.1 (November 2018). So for almost 4 years, this feature has been under development, waiting to graduate to GA quality.
What is a User Defined Variable?
Here is an example:
tidb> SET @a=1;
Query OK, 0 rows affected (0.00 sec)

tidb> SELECT @a;
+----+
| @a |
+----+
|  1 |
+----+
1 row in set (0.00 sec)
Why would you use this feature? Well, you might not even realize you are already using it. For example when importing a mysqldump file, it will include the following SQL syntax with user variables:
-- MySQL dump 10.13  Distrib 8.0.27, for Linux (x86_64)
--
-- Host: 127.0.0.1    Database: test
-- ------------------------------------------------------
-- Server version 5.7.25-TiDB-v6.2.0-alpha-68-g7f3a72e21

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!50503 SET NAMES utf8mb4 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `t1`
--

DROP TABLE IF EXISTS `t1`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!50503 SET character_set_client = utf8mb4 */;
CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![placement] PLACEMENT POLICY=`p1` */;
/*!40101 SET character_set_client = @saved_cs_client */;
Various connectors and ORMs will also make use of user-defined variables. Not using this feature in production is basically unavoidable. It is almost as impossible as saying
SELECT syntax is experimental and should not be used in production.
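To check how much a particular dump leans on user-defined variables, you can scan it for assignments of the form SET @name = value. Here is a minimal Python sketch, not a real SQL parser: the function name find_user_variables and the regular expression are my own illustrative choices, and it only recognizes the "@name =" assignment pattern that mysqldump emits (it would also flag a comparison such as WHERE @a = 1).

```python
import re

# Matches an assignment to a user-defined variable such as "@OLD_SQL_MODE=".
# The negative lookbehind skips system variables, which are written "@@name".
USER_VAR_ASSIGNMENT = re.compile(r"(?<!@)@([A-Za-z_][A-Za-z0-9_$.]*)\s*:?=")


def find_user_variables(dump_text):
    """Return the distinct user-defined variable names assigned in dump_text."""
    names = []
    for match in USER_VAR_ASSIGNMENT.finditer(dump_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names


# A few lines in the style of the mysqldump header shown earlier.
SAMPLE_DUMP = """\
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = @saved_cs_client */;
"""

if __name__ == "__main__":
    print(find_user_variables(SAMPLE_DUMP))
```

Even this five-line sample assigns three user-defined variables, which is the point: a plain dump-and-restore already depends on the feature.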
What about other experimental features?
Let me go through some of the experimental features both listed and unlisted from the experimental features documentation page:
JSON Functions / JSON Data Type (experimental since v2.1)
JSON Functions and the JSON Data Type are experimental for the usual reason: there are some unfixed bugs and a lack of test coverage. It requires some investment to fix, but from my observation these features tend to be used less in China domestically than globally. I am not sure why this is, but I very much hope this is prioritized for the global market.
SHOW CONFIG / SET CONFIG (experimental since v4.0)
SHOW CONFIG and SET CONFIG make sense for PD and TiKV config, but for TiDB config, this is now the wrong direction and should be deprecated and removed. The system variables framework now has instance level config, and it works a little better (it has option validation, typing, and works in a standard MySQL way).
SQL Diagnostics (experimental since v4.0)
SQL Diagnostics is a feature to provide a SQL interface on top of Prometheus, with some common diagnostic queries presented as if they were tables. I’m not sure what is required to graduate this feature, but I’m a SQL user (and less so a Prometheus/Grafana user), so I'm a fan of this work.
Views (experimental since v2.1)
SQL Views seem to work okay in basic usage. I suspect that this feature is experimental because of minor issues, and/or because it is missing test cases. In terms of functionality: I do also know of a couple of missing optimizations, and views are currently non-updatable.
Side note: The documentation also needs work. Views are listed on the experimental features page, but on the page for SQL Views there is no mention of their experimental status. The link is also to the information schema page, not the feature description page.
Fast Analyze / Extended Statistics / Analyze V2 (experimental since v4.0/v5.0/v5.3)
I’m not sure of the current state of these experiments. I find fast analyze useful, since analyze takes a long time in TiDB. When there are multiple experiments in the same area, it often hurts usability a little, as the user has to read up to understand what the compatibility is between them. So from a product perspective, I hope some can graduate soon.
Local Transactions (experimental since v4.0?)
Local transactions cater for the use case that you can have a single TiDB cluster span globally, with portions of data updated locally without cross-datacenter latency. This is required because without local transactions a TiDB server needs to talk to a global PD server to start and finish a transaction to get the current timestamp (TSO).
Local transactions make sense for TiDB based on its current use of TSO, but long term I would love to see Time Appliances offered. There is unfortunately a hardware requirement here, so I am assuming both that the cost comes down and that they become available in public cloud(s).
Cascades Planner (experimental since v4.0?)
This is a new bottom up, state of the art planner for TiDB. The work was never finished and appears to be stalled for now.
Table Locks (experimental since v4.0)
An important feature for some migration cases, since sometimes you want to prevent writes to tables and/or clusters. This feature (enabled via enable-table-lock in tidb.toml) is completely undocumented. Work appears to be stalled for now, but this should really be GA, since it is a basic MySQL compatibility feature (and, like user-defined variables, is also used by mysqldump).
Multi schema change (experimental since v5.0?)
This experiment allows a single ALTER TABLE statement to contain several chained changes. The option is OFF by default and not currently documented.
While doing my research for this post, I was pleasantly surprised that there have been several recent experimental feature graduations:
- Global Kill (experimental since v4.0?, stable in v6.1)
- Raft Engine (experimental since v5.4, stable in v6.1)
- Persist config parameters in PD (experimental since v4.0, stable in v6.1)
- Online Unsafe Recovery (experimental in v5.3, stable in v6.1)
- List/List Columns Partitioning and Dynamic Pruning (experimental since v5.0/v5.1, stable in v6.1)
- Top SQL (experimental in v5.4, stable in v6.0)
- Continuous Profiling (experimental in v5.3, stable in v6.0)
- Placement Rules in SQL (experimental in v5.3, stable in v6.0)
- Automatically scale TiKV Thread Pool (experimental in v5.4, stable in v6.0)
Some of these features I really love too:
Global Kill is a useful feature since it allows the regular KILL n syntax, and Ctrl+C in clients, to work. Without this, TiDB requires the proprietary KILL TIDB n syntax to prevent the wrong connection from being killed, because TiDB might be behind a load balancer. A welcome addition!
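The wrong-connection problem is easy to reproduce in a toy model. The Python sketch below is purely illustrative and is not TiDB's implementation: the Server class, the kill and demo helpers, and the bit layout that packs a server ID into the connection ID are all assumptions made for the example. The idea it shows is only that connection IDs which are unique across servers let any server route a KILL to the right peer.

```python
class Server:
    """A toy TiDB-like server that hands out connection IDs (hypothetical)."""

    def __init__(self, server_id, global_ids):
        self.server_id = server_id
        self.global_ids = global_ids
        self.next_local_id = 1
        self.connections = {}  # connection ID -> owner name

    def connect(self, owner):
        local_id = self.next_local_id
        self.next_local_id += 1
        if self.global_ids:
            # Assumed layout: server ID in the high bits, local counter in the low 20 bits.
            conn_id = (self.server_id << 20) | local_id
        else:
            conn_id = local_id
        self.connections[conn_id] = owner
        return conn_id


def kill(servers, receiving_server, conn_id, global_ids):
    """Return the owner whose connection is killed, or None if nothing matches."""
    if global_ids:
        # The ID itself says which server owns the connection, so the
        # receiving server can forward the request to the right peer.
        target = servers[conn_id >> 20]
    else:
        # Without global IDs the receiving server can only look locally.
        target = receiving_server
    return target.connections.pop(conn_id, None)


def demo(global_ids):
    servers = {1: Server(1, global_ids), 2: Server(2, global_ids)}
    alice = servers[1].connect("alice")
    servers[2].connect("bob")
    # Assume the load balancer happens to route alice's KILL to server 2.
    return kill(servers, servers[2], alice, global_ids)


if __name__ == "__main__":
    print(demo(global_ids=False))  # bob's connection is killed by mistake
    print(demo(global_ids=True))   # alice's connection is killed as intended
```

With per-server IDs, both alice and bob hold connection 1 on their respective servers, so a KILL 1 that lands on the wrong server terminates the wrong session.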
Raft Engine switches from using RocksDB to a use-case specific engine to store Raft logs in TiKV. This reduces the IO required to log changes, and improves TiKV performance.
Placement Rules in SQL allows you to specify rules for how data is stored in standard SQL. For example, you can add additional constraints such as regions or the number of copies (note: this is a feature I worked on.)
So the basic process of features being introduced as experimental before later graduating is working. There are just some features that have been stuck in experimental state.
Despite the tone of this post, I am actually a proponent of shipping experimental features. It allows engineering teams to be more productive (fewer merge conflicts), and collect feedback so they can change behaviors before going GA (after which, it's very difficult to do). My issue is specifically that the rules need to be strict.
What do I think should be done?
I hope this isn't too controversial, but: if a feature has been experimental since TiDB v2.1 and has not managed to graduate, it should be deprecated and removed from the source code. If that’s not viable to do, then Engineering/QA resources should be applied today to make sure it can be made GA.
The requirement to graduate or be removed should be a pre-condition for allowing experimental features to be merged into the main branch. MySQL, for example, doesn't typically allow for experimental features to be merged.
In future, it should also be possible to disable or enable all experiments on a TiDB Cluster with one SQL command. The current warning to avoid using in production honestly makes me a little angry, since it puts too much of the burden on users. Vendors have a responsibility to make sure that if they are going to ship stable and unstable features together, the unstable ones can actually be avoided.
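To make the "one switch" idea concrete, here is a small Python sketch of the behavior I have in mind. Everything in it is hypothetical: TiDB has no such ExperimentGate, and the class, method names, and error type are invented for illustration only.

```python
class ExperimentGate:
    """Hypothetical registry with one master switch for all experiments."""

    def __init__(self):
        self._experimental = set()
        self.experiments_enabled = True  # the single cluster-wide switch

    def register_experimental(self, feature):
        self._experimental.add(feature)

    def graduate(self, feature):
        # A graduated (GA) feature is no longer subject to the master switch.
        self._experimental.discard(feature)

    def check(self, feature):
        """Raise if feature is experimental and experiments are switched off."""
        if feature in self._experimental and not self.experiments_enabled:
            raise PermissionError(
                f"{feature} is experimental and experiments are disabled"
            )
        return True


if __name__ == "__main__":
    gate = ExperimentGate()
    gate.register_experimental("cascades planner")
    gate.experiments_enabled = False  # one switch turns every experiment off

    print(gate.check("select"))  # stable features are unaffected
    try:
        gate.check("cascades planner")
    except PermissionError as err:
        print(err)
```

The useful property is that the default can be made safe: a production cluster flips the switch once, and any feature that has not graduated is rejected rather than silently used.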