Take image from corrupted hard drive

There are at least two cases where it makes sense to take an image of a corrupted hard drive as soon as possible: disk hardware errors and a corrupted filesystem. A faulty hard drive may give you just one chance to read a block, so there is no time for experiments. The picture is similar with corrupted filesystems: something has obviously gone wrong, and it’s hard to predict how the operating system will behave in the next second or whether it will cause even more damage.

Save disk image to local storage

Probably the best and fastest way is to plug the faulty disk into a healthy server and save the disk image locally:
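A minimal sketch of such a command, wrapped in a shell function so that the source and the destination are explicit. The function name image_disk and the example paths are illustrative assumptions; only dd with its bs and conv=noerror options is taken as given.

```shell
# Sketch: copy a block device (or file) into an image, continuing past read errors.
# image_disk is a hypothetical helper name, not part of any tool.
image_disk() {
    src=$1   # faulty disk, e.g. /dev/sdb
    dst=$2   # image file on the healthy disk, e.g. faulty_disk.img
    # bs=512 is dd's default block size; conv=noerror keeps going after a failed read.
    dd if="$src" of="$dst" bs=512 conv=noerror
}

# Example invocation (run as root; double-check the device name first):
#   image_disk /dev/sdb faulty_disk.img
```

If byte offsets inside the image need to match the offsets on the disk, conv=noerror,sync is worth considering: it pads each unreadable block with zeroes instead of dropping it.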

Here /dev/sdb is the faulty disk, and faulty_disk.img is the image file on the healthy /dev/sda disk.

conv=noerror tells dd to continue reading even if a read() call returns an error. dd will thus skip bad areas and dump as much information from the disk as possible.

By default dd reads 512 bytes at a time, and that is a good value. Reading larger blocks would be faster, but a larger block fails as a whole even if only a small portion of it is unreadable. An InnoDB page is 16k, so dd reads one page in 32 operations. It’s possible to extract information even if the page is partially corrupt, so reading in 512-byte blocks seems to be optimal unless somebody convinces me otherwise.
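The trade-off can be made concrete with a little arithmetic. The sketch below assumes the default 16 KB InnoDB page and that a single unreadable sector makes the whole dd block fail; the function name is made up for illustration.

```shell
PAGE_SIZE=16384   # default InnoDB page size in bytes

# Print, for a given dd block size, how many reads cover one page and
# how many bytes of the page are lost when one block cannot be read.
block_size_tradeoff() {
    bs=$1
    echo "bs=$bs reads_per_page=$((PAGE_SIZE / bs)) bytes_lost_per_bad_block=$bs"
}

for bs in 512 4096 16384; do
    block_size_tradeoff "$bs"
done
```

With 512-byte blocks a single bad sector costs 512 bytes of the page; with 16 KB blocks it costs the whole page.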

Save disk image to remote storage

If the faulty disk can’t be unplugged, the best (if not the only) way is to save the disk image to remote storage.

Netcat is an excellent tool for this purpose.

Start a listener on the destination server:
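A sketch of the listening side, assuming the traditional netcat syntax (nc -l -p PORT); the OpenBSD variant takes nc -l PORT instead. Port 1234, the output file name, and the function name are placeholders chosen for illustration.

```shell
# Sketch: wait for one incoming connection and write whatever arrives to a file.
# receive_image is a hypothetical helper; adjust the nc flags to your netcat flavor.
receive_image() {
    port=$1      # TCP port to listen on, e.g. 1234
    outfile=$2   # where to store the image, e.g. faulty_disk.img
    nc -l -p "$port" > "$outfile"
}

# Example invocation on the destination server:
#   receive_image 1234 faulty_disk.img
```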

On the server with the faulty disk, take a dump and stream it over the network:
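A sketch of the sending side, assuming /dev/sdb is the faulty disk and that a netcat listener is already waiting on port 1234 (a placeholder). The helper names dump_disk and send_image are hypothetical.

```shell
# Sketch: read a device with dd, continuing past read errors, and write to stdout.
dump_disk() {
    dd if="$1" bs=512 conv=noerror
}

# Sketch: stream the dump to a netcat listener on the destination server.
# send_image is a hypothetical helper; host and port are supplied by the caller.
send_image() {
    device=$1
    host=$2
    port=$3
    dump_disk "$device" | nc "$host" "$port"
}

# Example invocation on the server with the faulty disk:
#   send_image /dev/sdb a.b.c.d 1234
```

Depending on the netcat flavor, an option such as -q or -N may be needed so that nc closes the connection once dd has finished.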

a.b.c.d is the IP address of the destination server.

Why dd is better for MySQL data recovery

There are plenty of good file recovery and file undelete tools. However, they serve a slightly different purpose. In short, they try to reconstruct a file, so they care about the filesystem.

For MySQL data recovery we don’t need files, we need data. An InnoDB page can be recognized by a short signature at the beginning of the page. Every index page contains two internal records at fixed positions: infimum and supremum.
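As a rough illustration of checking for that signature, the sketch below looks for the two marker strings at fixed offsets inside a 16 KB page of an image. The offsets 99 and 112 are an assumption corresponding to the COMPACT row format; pages in the REDUNDANT format place the markers elsewhere. The function name is hypothetical.

```shell
# Sketch: succeed if page number $2 (16 KB pages) of image $1 carries the
# infimum/supremum markers at the offsets assumed for COMPACT-format pages.
is_index_page() {
    image=$1
    page_no=$2
    base=$((page_no * 16384))
    # Read 7 and 8 bytes at the assumed offsets; strip NUL bytes before comparing.
    inf=$(dd if="$image" bs=1 skip=$((base + 99)) count=7 2>/dev/null | tr -d '\0')
    sup=$(dd if="$image" bs=1 skip=$((base + 112)) count=8 2>/dev/null | tr -d '\0')
    [ "$inf" = "infimum" ] && [ "$sup" = "supremum" ]
}

# Example invocation:
#   is_index_page faulty_disk.img 3 && echo "page 3 looks like an index page"
```

Because the check needs only the first bytes of a page, it can still succeed when the rest of the page is damaged.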

If the header is good, then we know which table the page belongs to, how many records to expect, and so on. Even if the rest of the page is heavily corrupted, it’s possible to extract all surviving records.

I have had several cases where dd excelled.

Story #1.

It was a dying hard drive. InnoDB crashed all the time. When the customer figured out that the problem was with the disk, they tried to copy the MySQL files, but a simple copy failed. The customer then tried to read the files with a file recovery tool.

MySQL refused to start and reported checksum mismatches in the error log.

The customer provided the recovered files. The size of the ibdata1 file was reasonable, but stream_parser found only ~20MB of pages. ibdata1 was almost empty inside: just zeroes where the data should be. I doubt that even 40% of the data was recovered.

Then we took a dump of the disk and recovered the InnoDB tables from the image. First of all, ~200MB of pages were found. Many tables were 100% recovered, and around 80-90% of the records were fetched from the corrupted tables.

Story #2.

A customer had dropped an InnoDB database. MySQL was running with innodb_file_per_table=ON, so the tables were in .ibd files, which were deleted. It was a Windows server, and the customer used a tool to undelete the .ibd files from the NTFS filesystem. The tool restored the files, but the .ibd files were almost empty inside. The recovery rate was close to 20%.

Recovery from a disk dump gave around 70-80% of records.

  • UPDATE (01/01/2017):

    We stopped further development of undrop-for-innodb and do not support its open source versions.