Taking An Image From Corrupted Hard Drive

There are at least two cases when it makes sense to take an image from a corrupted hard drive as soon as possible: disk hardware errors and a corrupted filesystem. A faulty hard drive may give you just one chance to read a block, so there is no time for experiments. It’s pretty much the same with corrupted filesystems. Something has obviously gone wrong, and it’s hard to predict how the operating system will behave the next second or whether it will cause even more damage.

Save Disk Image To Local Storage

Probably the best and fastest way is to plug the faulty disk into a healthy server and save the disk image locally with dd. Assume /dev/sdb is the faulty disk and faulty_disk.img is the image file stored on the healthy /dev/sda disk.

The conv=noerror option tells dd to continue reading even if a read() call fails with an error. Thus, dd skips bad areas and dumps as much information from the disk as possible.

By default, dd reads in 512-byte blocks, which is a good value. Reading larger blocks would be faster, but a larger block fails as a whole even if only a small portion of it is unreadable. An InnoDB page is 16 KB, so dd reads one page in 32 operations (16384 / 512). It’s possible to extract information even if the page is partially corrupt. So, reading in 512-byte blocks seems optimal unless somebody convinces me otherwise.
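The local imaging step described above can be sketched as a small shell function. This is only an illustration: the function name image_disk and the example paths are hypothetical, and the only dd options used are the ones discussed in the text (a 512-byte block size and conv=noerror).

```shell
# image_disk SRC IMG
# Copy SRC (e.g. the faulty disk) to IMG in 512-byte blocks,
# continuing past read errors instead of aborting.
# The function name is hypothetical; the dd options are the ones
# discussed in the text above.
image_disk() {
    dd if="$1" of="$2" bs=512 conv=noerror
}

# Example invocation (needs root; paths are placeholders):
#   image_disk /dev/sdb /mnt/healthy/faulty_disk.img
```

A common variation is conv=noerror,sync, which writes a block of zeroes in place of every unreadable block so that later data keeps its original offset in the image. That variation is not part of the description above, so treat it as an option to evaluate rather than a recommendation.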

Save Disk Image To Remote Storage

If the faulty disk can’t be unplugged, the best (if not the only) way is to save the disk image to remote storage.

Netcat is an excellent tool for this purpose.

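On the destination side, start a netcat server that listens on a port and writes everything it receives to the image file.

On the server with the faulty disk, dump the disk with dd and stream the output over the network to a.b.c.d, the IP address of the destination server.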
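The two steps above can be sketched as a pair of shell functions. The port number 1234 and the function names are assumptions made for illustration. Netcat flags also differ between implementations: the listener below uses the OpenBSD-style syntax, while traditional netcat expects the port after -p.

```shell
PORT=1234   # hypothetical port; pick any free one

# receive_image IMG
# Run on the destination server: listen and write the stream to IMG.
# OpenBSD netcat syntax; with traditional netcat use: nc -l -p "$PORT"
receive_image() {
    nc -l "$PORT" > "$1"
}

# send_image SRC HOST
# Run on the server with the faulty disk: dump SRC with dd and
# pipe it to netcat, which sends it to HOST (a.b.c.d in the text).
send_image() {
    dd if="$1" bs=512 conv=noerror | nc "$2" "$PORT"
}

# Example invocation (paths and address are placeholders):
#   destination:  receive_image faulty_disk.img
#   source:       send_image /dev/sdb a.b.c.d
```

Start the receiver first. Depending on the netcat variant, the sender may not exit on its own once dd finishes, so it is worth comparing the image size with the disk size before closing either side.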

Why dd Is Better For MySQL Data Recovery

There are plenty of good file recovery and file undelete tools. However, they serve a slightly different purpose. In short, they try to reconstruct a file. They care about the file system.

For MySQL data recovery we don’t need files; we need data. An InnoDB page can be recognized by a short signature at the beginning of the page: every index page has two internal records in fixed places, infimum and supremum.
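To make the signature idea concrete, here is a rough heuristic in shell. It is a sketch under stated assumptions, not how a real recovery tool works: it assumes pages start at multiples of the step size (16384 bytes by default) within the image and that both record names appear within the first 128 bytes of a page. The function name count_index_pages is hypothetical, and a shell loop like this is far too slow for large images.

```shell
# count_index_pages IMG [STEP]
# Count chunks of IMG, taken every STEP bytes (default 16384),
# whose first 128 bytes contain "infimum" followed by "supremum".
# Heuristic only: the alignment and the 128-byte window are assumptions.
count_index_pages() {
    img=$1
    step=${2:-16384}
    size=$(wc -c < "$img")
    total=$((size / step))
    found=0
    i=0
    while [ "$i" -lt "$total" ]; do
        # Map every byte that is not a lowercase letter to '.',
        # so the header is safe to hold in a shell variable.
        header=$(dd if="$img" bs="$step" skip="$i" count=1 2>/dev/null |
                 head -c 128 | LC_ALL=C tr -c 'a-z' '.')
        case $header in
            *infimum*supremum*) found=$((found + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "$found"
}

# Example invocation (path is a placeholder):
#   count_index_pages faulty_disk.img
```

On a raw disk image, pages are not guaranteed to sit on 16 KB boundaries, so a smaller step such as 4096 or 512 may find more candidates at the cost of speed.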

If the header is good, then we know what table the page belongs to, how many records to expect, etc. Even if the rest of the page is heavily corrupted, it’s possible to extract all surviving records.

I have had several cases where dd excelled.

Story #1.

It was a dying hard drive. InnoDB crashed all the time. When the customer figured out that the problem was with the disk, they tried to copy the MySQL files, but a simple copy failed. The customer then tried to read the files with a file recovery tool.

MySQL refused to start on the recovered files and reported checksum mismatches in the error log.

The customer provided the recovered files. The size of the ibdata1 file was reasonable, but stream_parser found only ~20MB of pages. ibdata1 was almost empty inside: just zeroes where the data should have been. I doubt that even 40% of data was recovered.

Then, we tried to dump the disk and recover InnoDB tables from the image. First of all, ~200MB of pages were found. Many tables were 100% recovered and about 80-90% of records were fetched from corrupted tables.

Story #2.

A customer dropped an InnoDB database. MySQL was running with innodb_file_per_table=ON, so the tables were in .ibd files that were deleted. It was a Windows server, and the customer used a tool to undelete the .ibd files from the NTFS filesystem. The tool restored the files, but the .ibd files were almost empty inside. The recovery rate was close to 20%.

Recovery from a disk dump gave about 70-80% of records.
