Exporting and splitting very large MySQL data sets

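The production database used a shared tablespace, and because of how the server was set up it could not be expanded. So during the migration I switched from the shared tablespace to per-table tablespaces, which makes later growth easier. The production data is 1.6T and the exported SQL file was about 745G. My first attempt used the official mysqldump tool. The export took 5.5 hours, but the import was painful: because it runs single-threaded, it would have taken far longer than expected, probably four or five days. That was unacceptable. The cycle is long, and if something goes wrong midway it is hard to resume.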

I. Exporting the production data
1. Download and install mydumper

# yum install glib2-devel mysql-devel zlib-devel pcre-devel openssl-devel
# wget https://github.com/maxbube/mydumper/releases/download/v0.9.3/mydumper-0.9.3-41.el6.x86_64.rpm
# yum install ./mydumper-0.9.3-41.el6.x86_64.rpm

2. Export the production data (be sure to split the export by row count or by file size)

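# To make full use of multi-threaded import, split the export by file size. I used 512M; the command below passes -F 256 (256 MB), and in practice you can go even smaller
# mydumper -h xxx -u xxx -p xxx --less-locking --threads 8 -F 256 --triggers --events --routines -v 3 --outputdir /data/bak/mysql/1225 >> /tmp/restorelog2 2>&1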
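
As a minimal sketch of how the export command could be assembled and reviewed before it is run, the function below prints a mydumper command line with the same flags as above. It assumes credentials live in a defaults file passed with --defaults-file (an option listed in the mydumper help) instead of -u/-p on the command line; which section of that file is read depends on the build, so check before relying on it. The function name and all paths are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: build the mydumper command line from a few settings and print it.
# Nothing is executed here; the output is meant to be reviewed first.

build_dump_cmd() {
    # $1 = defaults file holding the credentials (assumption, not from the article)
    # $2 = output directory
    # $3 = chunk size in MB (-F)
    # $4 = number of dump threads
    printf 'mydumper --defaults-file %s --less-locking --threads %s -F %s --triggers --events --routines -v 3 --outputdir %s\n' \
        "$1" "$4" "$3" "$2"
}

# Placeholder values; adjust before use.
build_dump_cmd /root/.my.cnf /data/bak/mysql/1225 256 8
```

Printing the command first makes it easy to compare chunk sizes (for example 256 against 128) without touching the server; once it looks right, the printed line can be run by hand with the log redirection shown above.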

3. Parameter reference

# mydumper --help
Usage:
  mydumper [OPTION...] multi-threaded MySQL dumping

Help Options:
  -?, --help                  Show help options

Application Options:
  -B, --database              Database to dump
  -T, --tables-list           Comma delimited table list to dump (does not exclude regex option)
  -o, --outputdir             Directory to output files to
  -s, --statement-size        Attempted size of INSERT statement in bytes, default 1000000
  -r, --rows                  Try to split tables into chunks of this many rows. This option turns off --chunk-filesize
  -F, --chunk-filesize        Split tables into chunks of this output file size. This value is in MB
  -c, --compress              Compress output files
  -e, --build-empty-files     Build dump files even if no data available from table
  -x, --regex                 Regular expression for 'db.table' matching
  -i, --ignore-engines        Comma delimited list of storage engines to ignore
  -m, --no-schemas            Do not dump table schemas with the data
  -d, --no-data               Do not dump table data
  -G, --triggers              Dump triggers
  -E, --events                Dump events
  -R, --routines              Dump stored procedures and functions
  -W, --no-views              Do not dump VIEWs
  -k, --no-locks              Do not execute the temporary shared read lock.  WARNING: This will cause inconsistent backups
  --no-backup-locks           Do not use Percona backup locks
  --less-locking              Minimize locking time on InnoDB tables.
  -l, --long-query-guard      Set long query timer in seconds, default 60
  -K, --kill-long-queries     Kill long running queries (instead of aborting)
  -D, --daemon                Enable daemon mode
  -I, --snapshot-interval     Interval between each dump snapshot (in minutes), requires --daemon, default 60
  -L, --logfile               Log file name to use, by default stdout is used
  --tz-utc                    SET TIME_ZONE='+00:00' at top of dump to allow dumping of TIMESTAMP data when a server has data in different time zones or data is being moved between servers with different time zones, defaults to on use --skip-tz-utc to disable.
  --skip-tz-utc               
  --use-savepoints            Use savepoints to reduce metadata locking issues, needs SUPER privilege
  --success-on-1146           Not increment error count and Warning instead of Critical in case of table doesn't exist
  --lock-all-tables           Use LOCK TABLE for all, instead of FTWRL
  -U, --updated-since         Use Update_time to dump only tables updated in the last U days
  --trx-consistency-only      Transactional consistency only
  --complete-insert           Use complete INSERT statements that include column names
  -h, --host                  The host to connect to
  -u, --user                  Username with the necessary privileges
  -p, --password              User password
  -P, --port                  TCP/IP port to connect to
  -S, --socket                UNIX domain socket file to use for connection
  -t, --threads               Number of threads to use, default 4
  -C, --compress-protocol     Use compression on the MySQL connection
  -V, --version               Show the program version and exit
  -v, --verbose               Verbosity of output, 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2
  --defaults-file             Use a specific defaults file

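4. Exported file types

metadata: contains the export start and end times; if the binlog is enabled it records the binary log position; if GTID is enabled it records the GTID information.
db.table.sql: data file containing INSERT statements
db.table-schema.sql: contains the CREATE TABLE statement
db-schema-create.sql: contains the CREATE DATABASE statement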
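
To get a quick overview of what a dump directory contains, the files can be grouped by name. The sketch below is hypothetical helper code, not part of mydumper; it assumes the file naming of the 0.9.x releases (for example a `-schema-create.sql` suffix for the database file) and treats anything it does not recognise, including compressed files, as "other".

```shell
#!/bin/sh
# Sketch: classify a mydumper output file by its name.
# Pattern order matters: the more specific suffixes are tested first.

classify_dump_file() {
    case "$(basename "$1")" in
        metadata)             echo metadata ;;
        *-schema-create.sql)  echo create-database ;;
        *-schema.sql)         echo create-table ;;
        *-schema-*.sql)       echo other-schema ;;   # views, triggers, routines
        *.sql)                echo data ;;
        *)                    echo other ;;
    esac
}

# Count files per type in a dump directory (placeholder path).
dump_dir=/data/bak/mysql/1225
if [ -d "$dump_dir" ]; then
    for f in "$dump_dir"/*; do
        classify_dump_file "$f"
    done | sort | uniq -c
fi
```

A large number of "data" files relative to the thread count is what lets the import spread its work across threads.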

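5. How the consistent snapshot is obtained

- As a precaution, slow queries running on the server either abort the dump or get killed (see --long-query-guard and --kill-long-queries)
- A global write lock is acquired ("flush tables with read lock")
- Various metadata is read ("show slave status", "show master status")
- Snapshots are consistent across transactional and non-transactional tables (0.2.2+)
- Once all worker threads have reported that their snapshot is established, the master thread executes "unlock tables" and starts running the queued jobs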

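6. How the export works

7. Main steps of a mydumper export

- The main thread runs flush tables with read lock, taking a global read lock that blocks DML writes and keeps the data consistent
- It reads the current binary log file name and write position and records them in the metadata file for use during recovery
- start transaction with consistent snapshot; opens a consistent-read transaction
- n dump threads are started (configurable with --threads, default 4) to export table data and table structure
- Non-transactional tables are backed up
- The main thread runs unlock tables: once the non-transactional tables are backed up, the global read lock is released
- InnoDB tables are dumped inside the transactions
- The transactions end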
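
The ordering of the steps above can be sketched as a plain SQL sequence. This is a deliberately simplified, single-session illustration and not what mydumper literally executes (mydumper coordinates several connections); the function name and the table argument are hypothetical. The whole sequence has to run in one session, because both the lock and the snapshot belong to the connection that created them.

```shell
#!/bin/sh
# Sketch: print the SQL a single session would run to read one table
# from a consistent snapshot while holding the global lock only briefly.

snapshot_sql() {
    # $1 = schema-qualified table to read inside the snapshot
    cat <<EOF
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;
UNLOCK TABLES;
SELECT * FROM $1;
COMMIT;
EOF
}

# Print the statements for a placeholder table.
snapshot_sql mydb.mytable
```

The point of the ordering is that the binlog position is read and the snapshot is opened while writes are blocked, so both describe the same moment; the lock is released before the long-running read starts.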

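II. Importing the production data
1. Before importing, set up monitoring of a few standard MySQL counters so you can follow how the import is progressing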

# mysql -u xxx -p -e "show global status"|grep Com_insert             # INSERT statements executed (sample twice to get a rate)
# mysql -u xxx -p -e "show global status"|grep Com_update             # UPDATE statements executed
# mysql -u xxx -p -e "show global status"|grep Bytes_sent             # bytes sent by MySQL (outbound traffic)
# mysql -u xxx -p -e "show global status"|grep Bytes_received         # bytes received by MySQL (inbound traffic)
# mysql -u xxx -p -e "show global status"|grep Innodb_rows_inserted   # rows inserted into InnoDB
# mysql -u xxx -p -e "show global status"|grep Innodb_rows_updated    # rows updated in InnoDB
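
These status values are cumulative counters, so a rate only appears when two samples are compared. A minimal sketch follows, assuming the mysql client can log in without prompting (for example through ~/.my.cnf); the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: turn two samples of a cumulative status counter into a per-second rate.

read_counter() {
    # $1 = status variable name, e.g. Com_insert
    # -N drops the header row, -B gives tab-separated output
    mysql -N -B -e "SHOW GLOBAL STATUS LIKE '$1'" | awk '{print $2}'
}

rate_per_sec() {
    # $1 = earlier sample, $2 = later sample, $3 = seconds between them
    echo $(( ($2 - $1) / $3 ))
}

# Usage against a live server (left commented out so the file can be
# sourced without a database connection):
#   a=$(read_counter Innodb_rows_inserted); sleep 10
#   b=$(read_counter Innodb_rows_inserted)
#   rate_per_sec "$a" "$b" 10
```

Watching Innodb_rows_inserted this way during the import gives a rows-per-second figure that can be compared before and after a configuration change.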

2. To speed up the import, adjust a few MySQL parameters beforehand. My environment is MySQL 5.5.45 on a machine with 32 cores and 128G of memory

# cat /etc/my.cnf
...
innodb_buffer_pool_size = 80960M
innodb_file_per_table = 1

innodb_flush_log_at_trx_commit=0
interactive_timeout = 120
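wait_timeout = 600                        # without this, a multi-threaded import may fail midway
max_allowed_packet = 1024M     # without this, a multi-threaded import may fail midway (1G is the maximum MySQL accepts)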

innodb_read_io_threads = 6
innodb_write_io_threads = 24
innodb_purge_threads=6
innodb_autoextend_increment = 128      # the unit is already MB, so no M suffix
bulk_insert_buffer_size=512M
...
mysql
> SET global autocommit=0;
> SET global unique_checks=0;
> SET global foreign_key_checks=0;

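3. Start the import

myloader ships with mydumper, so once mydumper is installed the myloader command is available. Run the import from a script where possible, so that the start and end times are recorded. Also be sure to add **-v 3**: it shows the whole execution, so if the import breaks off midway you can see where it stopped and resume manually.

# myloader -h 127.0.0.1 -u xxx -p xxx --threads 24 -q 30000 -v 3 --directory /data/bak/mysql/1225/ >> $log 2>&1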
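
One way to follow the advice about running the import from a script is a small wrapper that writes the start time, the command output, and the end time to the same log. This is a generic sketch; the function name and the log path are hypothetical, and it works for any command, not only myloader.

```shell
#!/bin/sh
# Sketch: run a command, appending its output plus start/end timestamps to a log.

run_timed() {
    # $1 = log file, remaining arguments = command to run
    log=$1
    shift
    start=$(date +%s)
    echo "start: $(date '+%Y-%m-%d %H:%M:%S')" >> "$log"
    "$@" >> "$log" 2>&1
    rc=$?
    end=$(date +%s)
    echo "end:   $(date '+%Y-%m-%d %H:%M:%S') (exit $rc, $((end - start))s elapsed)" >> "$log"
    return $rc
}

# Usage (placeholder credentials and paths, left commented out):
#   run_timed /tmp/myloader.log myloader -h 127.0.0.1 -u xxx -p xxx \
#       --threads 24 -q 30000 -v 3 --directory /data/bak/mysql/1225/
```

Because the exit code is both logged and returned, a calling script can tell a failed import from a finished one without parsing the myloader output.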

4. myloader parameters

# myloader --help
Usage:
  myloader [OPTION...] multi-threaded MySQL loader

Help Options:
  -?, --help                        Show help options

Application Options:
  -d, --directory                   Directory of the dump to import
  -q, --queries-per-transaction     Number of queries per transaction, default 1000
  -o, --overwrite-tables            Drop tables if they already exist
  -B, --database                    An alternative database to restore into
  -s, --source-db                   Database to restore
  -e, --enable-binlog               Enable binary logging of the restore data
  -h, --host                        The host to connect to
  -u, --user                        Username with the necessary privileges
  -p, --password                    User password
  -P, --port                        TCP/IP port to connect to
  -S, --socket                      UNIX domain socket file to use for connection
  -t, --threads                     Number of threads to use, default 4
  -C, --compress-protocol           Use compression on the MySQL connection
  -V, --version                     Show the program version and exit
  -v, --verbose                     Verbosity of output, 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2
  --defaults-file                   Use a specific defaults file

5. Import time reference (under two days in total; about 1500G of data files generated)
1) Size of the data directory, sampled hourly

2019-12-26 10-01 636G /data/mysql
2019-12-26 11-01 675G /data/mysql
2019-12-26 12-01 710G /data/mysql
2019-12-26 13-01 739G /data/mysql
2019-12-26 14-01 769G /data/mysql
2019-12-26 15-01 796G /data/mysql
2019-12-26 16-01 826G /data/mysql
2019-12-26 17-01 852G /data/mysql
2019-12-26 18-01 880G /data/mysql
2019-12-26 19-01 916G /data/mysql
2019-12-26 20-01 946G /data/mysql
2019-12-26 21-01 972G /data/mysql
2019-12-26 22-01 1003G /data/mysql
2019-12-26 23-01 1.1T /data/mysql
2019-12-27 00-01 1064G /data/mysql
2019-12-27 01-01 1092G /data/mysql
2019-12-27 02-01 1116G /data/mysql
2019-12-27 03-01 1145G /data/mysql
2019-12-27 04-01 1173G /data/mysql
2019-12-27 05-01 1196G /data/mysql
2019-12-27 06-01 1223G /data/mysql
2019-12-27 07-01 1247G /data/mysql
2019-12-27 08-01 1272G /data/mysql
2019-12-27 09-01 1300G /data/mysql
2019-12-27 10-01 1328G /data/mysql
2019-12-27 11-01 1352G /data/mysql
2019-12-27 12-01 1376G /data/mysql
2019-12-27 13-01 1399G /data/mysql
2019-12-27 14-01 1423G /data/mysql
2019-12-27 15-01 1445G /data/mysql
2019-12-27 16-01 1466G /data/mysql
2019-12-27 17-01 1489G /data/mysql
2019-12-27 18-01 1505G /data/mysql
2019-12-27 19-01 1505G /data/mysql
2019-12-27 20-01 1511G /data/mysql
2019-12-27 21-01 1511G /data/mysql
2019-12-27 22-01 1511G /data/mysql
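
Since the figures above are cumulative directory sizes, the hourly import speed is the difference between consecutive samples. The sketch below computes that difference from lines in the same format; the function name is hypothetical. It only understands sizes written as whole gigabytes (such as 675G), so a sample written differently (such as 1.1T) is skipped and the next difference then covers two hours.

```shell
#!/bin/sh
# Sketch: print the growth between consecutive size samples.
# Input lines look like: 2019-12-26 11-01 675G /data/mysql

hourly_growth() {
    awk '$3 ~ /^[0-9]+G$/ {
        size = $3 + 0                     # "675G" -> 675
        if (seen) printf "%s %s +%dG\n", $1, $2, size - prev
        prev = size
        seen = 1
    }'
}

# Example with the first three samples from the table.
hourly_growth <<'EOF'
2019-12-26 10-01 636G /data/mysql
2019-12-26 11-01 675G /data/mysql
2019-12-26 12-01 710G /data/mysql
EOF
# prints:
# 2019-12-26 11-01 +39G
# 2019-12-26 12-01 +35G
```

Feeding it the full table shows whether the import slows down over time, which is hard to see from the cumulative figures alone.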

[Figure: MySQL status during the import]
