1. UnDROP Tool For InnoDB

UnDROP Tool For InnoDB

UnDROP Tool For InnoDB

The TwinDB data recovery toolkit is a set of tools that work with InnoDB tablespaces at low level.

Incredible Performance Of Stream_parser

stream_parser is a tool that finds InnoDB pages in a stream of bytes. It can be either a file such as ibdata1, *.ibd or a raw partition.
stream_parser runs as many parallel workers as the number of CPUs in the system.  The performance of stream_parser is amazing! Compare how stream_parser outperforms page_parser on a four-CPU virtual machine running on my laptop:

# ./page_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
Opening file: /dev/mapper/vg_twindbdev-lv_root
...
Size to process:               19327352832 (18.000 GiB)
1.00% done. 2014-06-23 03:03:48 ETA(in 00:18 hours). Processing speed: 17570320 B/sec
2.00% done. 2014-06-23 03:05:27 ETA(in 00:19 hours). Processing speed: 16106127 B/sec
3.00% done. 2014-06-23 03:02:11 ETA(in 00:16 hours). Processing speed: 19327352 B/sec
4.00% done. 2014-06-23 03:03:48 ETA(in 00:17 hours). Processing speed: 17570320 B/sec
...

As you see, it takes almost 20 minutes to parse 18G partition.

Let’s check stream_parser

# ./stream_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
...
Size to process:               19327352832 (18.000 GiB)
Worker(0): 1.91% done. 2014-06-23 02:51:41 ETA(in 00:00:56). Processing speed: 79.906 MiB/sec
Worker(2): 1.74% done. 2014-06-23 02:51:47 ETA(in 00:01:02). Processing speed: 72.000 MiB/sec
Worker(3): 3.30% done. 2014-06-23 02:51:15 ETA(in 00:00:30). Processing speed: 144.000 MiB/sec
Worker(1): 1.21% done. 2014-06-23 02:52:20 ETA(in 00:01:35). Processing speed: 47.906 MiB/sec
Worker(2): 5.38% done. 2014-06-23 02:51:11 ETA(in 00:00:25). Processing speed: 168.000 MiB/sec
Worker(3): 9.72% done. 2014-06-23 02:51:00 ETA(in 00:00:14). Processing speed: 296.000 MiB/sec
...
Worker(0): 88.91% done. 2014-06-23 02:52:06 ETA(in 00:00:02). Processing speed: 191.625 MiB/sec
Worker(0): 93.42% done. 2014-06-23 02:52:06 ETA(in 00:00:01). Processing speed: 207.644 MiB/sec
Worker(0): 97.40% done. 2014-06-23 02:52:06 ETA(in 00:00:00). Processing speed: 183.641 MiB/sec
All workers finished in 31 sec

So, 18 minutes versus 31 seconds. 34 times faster! Impressive, isn’t it?

C_parser Improvements

c_parser is a tool that reads an InnoDB page or many pages, extracts records and stores them in tab-separated values dumps. However, an InnoDB page with user data doesn’t store information about table structure. You should tell c_parser what fields you’re looking for. Furthermore, the command line option -t specifies a file with CREATE TABLE statement.

For example, this is how it works. Here’s the CREATE statement (I took it from mysqldump)

# cat sakila/actor.sql
CREATE TABLE `actor` (
  `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;

And now, let’s fetch records of the table actor from InnoDB pages:

# ./c_parser -6f pages-actor.ibd/FIL_PAGE_INDEX/0000000000001828.page -t sakila/actor.sql
-- Page id: 3, Format: COMPACT, Records list: Valid, Expected records: (200 200)
000000005313    970000013C0110  actor   1       "PENELOPE"      "GUINESS"       "2006-02-15 04:34:33"
000000005313    970000013C011B  actor   2       "NICK"  "WAHLBERG"      "2006-02-15 04:34:33"
000000005313    970000013C0126  actor   3       "ED"    "CHASE" "2006-02-15 04:34:33"
...
000000005313    970000013C09D8  actor   199     "JULIA" "FAWCETT"       "2006-02-15 04:34:33"
000000005313    970000013C09E4  actor   200     "THORA" "TEMPLE"        "2006-02-15 04:34:33"
-- Page id: 3, Found records: 200, Lost records: NO, Leaf page: YES

MySQL 5.6 introduced few format changes. However, most of them were already supported. On top of that, the c_parser fixes some bugs in the processing temporal fields.

The new UnDROP tool for InnoDB is still no reason not to take backups :-), but at least you can be better  armed if the inevitable happens.

How to Recover Table Structure

Normally, MySQL stores table structure in a respective .frm file. When the table is dropped, the .frm file is gone. Fortunately, InnoDB stores a copy of the structure in the dictionary. Sys_parser is a tool that can read the dictionary and generate a CREATE TABLE statement. Check how you can Recover Table Structure From the InnoDB Dictionary.

How to Install the TwinDB Data Recovery Toolkit

Check out the source code from GitHub:

$ git clone git@github.com:twindb/undrop-for-innodb.git
Cloning into 'undrop-for-innodb'...
remote: Counting objects: 227, done.
remote: Total 227 (delta 0), reused 0 (delta 0), pack-reused 227
Receiving objects: 100% (227/227), 1.10 MiB | 1.13 MiB/s, done.
Resolving deltas: 100% (57/57), done.

Next, you can compile the source code. But first, don’t forget to install dependencies: make, gcc, flex, bison.

# make
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c stream_parser.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include  -pthread -lm  stream_parser.o -o stream_parser
flex  sql_parser.l
bison  -o sql_parser.c sql_parser.y
sql_parser.y: conflicts: 6 shift/reduce
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c sql_parser.c
lex.yy.c:3078: warning: ‘yyunput’ defined but not used
lex.yy.c:3119: warning: ‘input’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c c_parser.c
./include/ctype-latin1.c:359: warning: ‘my_mb_wc_latin1’ defined but not used
./include/ctype-latin1.c:372: warning: ‘my_wc_mb_latin1’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c tables_dict.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c print_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c check_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include  sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser -pthread -lm
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include -o innochecksum_changer innochecksum.c
[root@twindb-dev undrop-for-innodb]#

UPDATE:

The toolkit is tested on following systems:

  1. CentOS release 5.10 (Final) x86_64
  2. CentOS release 6.5 (Final) x86_64
  3. CentOS Linux release 7.0.1406 (Core) x86_64
  4. Fedora release 20 (Heisenbug) x86_64
  5. Ubuntu 10.04.4 LTS (lucid) x86_64
  6. Ubuntu 12.04.4 LTS (precise) x86_64
  7. Ubuntu 14.04 LTS (trusty) x86_64
  8. Debian GNU/Linux 7.5 (wheezy) x86_64

32 bit operating systems are not supported.

Have a question? Ask the experts!

Previous Post Next Post