TiDB Lightning Overview
TiDB Lightning is a tool used for importing data at TB scale to TiDB clusters. It is often used for initial data import to TiDB clusters.
TiDB Lightning supports the following file formats:
- Files exported by Dumpling
- CSV files
- Apache Parquet files generated by Amazon Aurora
TiDB Lightning can read data from the following sources:
TiDB Lightning architecture
TiDB Lightning supports two import modes, configured by backend
. The import mode determines the way data is imported into TiDB.
Physical Import Mode: TiDB Lightning first encodes data into key-value pairs and stores them in a local temporary directory, then uploads these key-value pairs to each TiKV node, and finally calls the TiKV Ingest interface to insert data into TiKV's RocksDB. If you need to perform initial import, consider physical import mode, which has higher import speed.
Logical Import Mode: TiDB Lightning first encodes the data into SQL statements and then runs these SQL statements directly for data import. If the cluster to be imported is in production, or if the target table to be imported already contains data, use logical import mode.
Import mode | Physical Import Mode | Logical Import Mode |
---|---|---|
Speed | Fast (100~500 GiB/hour) | Low (10~50 GiB/hour) |
Resource consumption | High | Low |
Network bandwidth consumption | High | Low |
ACID compliance during import | No | Yes |
Target tables | Must be empty | Can contain data |
TiDB cluster version | >= 4.0.0 | All |
Whether the TiDB cluster can provide service during import | No | Yes |