1. How to Handle Wrong Page Type In External Pages

How to Handle Wrong Page Type In External Pages

Mainly, the first step to successful MySQL data recovery is finding InnoDB pages with your data. Let’s call it the first step, because prerequisite steps are already done.

An InnoDB page type is a two bytes integer in the header of a page. For MySQL data recovery two of them are important:

  • FIL_PAGE_INDEX. Pages of this type are nodes of B+ Tree index where InnoDB stores a table.
  • FIL_PAGE_TYPE_BLOB. So called external pages, where InnoDB keeps long values of BLOB or TEXT type.

Therefore, stream_parser reads a stream of bytes, finds InnoDB pages and sorts them by type, index, or page id. It applies sophisticated algorithms tailored for a particular page type. Of course, it assumes that page type in the header corresponds to the content of the page, otherwise it ignores the page.

Recently, I worked on a data recovery case that proved I was wrong in my assumptions. The customer dropped their database. They ran MySQL 5.0 with innodb_file_per_table=OFF. This is one of the easiest recovery scenarios, however, we couldn’t recover everything – the most important table was missing BLOB fields. Total number of recovers was close to the true value, but BLOB fields weren’t complete. Excessive “— #####CannotOpen_FIL_PAGE_TYPE_BLOB/0000000000000XYZ.page”” errors in the standard error output proved that stream_parser failed to find external pages.

Something went wrong, overwriting data was impossible, the customer stopped MySQL immediately after the accident. I decided to investigate why external pages couldn’t be found. Page_id is a file offset in 16k units. dd can extract particular page X:

# dd if=/var/lib/mysql/ibdata1 of=page-8 bs=16k count=1 skip=8
1+0 records in
1+0 records out
16384 bytes (16 kB) copied, 0.00175834 s, 9.3 MB/s

Here’s the header of page id 8, which is the index page:

# hexdump -C page-8 | head
00000000  9a 8f cd fc 00 00 00 08  ff ff ff ff ff ff ff ff  |................|
00000010  00 00 00 01 00 1b 3a 1c  45 bf 00 00 00 00 00 00  |......:.E.......|
00000020  00 00 00 00 00 00 00 02  00 b9 00 04 00 00 00 00  |................|
00000030  00 9e 00 02 00 01 00 02  00 00 00 00 00 00 00 00  |................|
00000040  00 01 00 00 00 00 00 00  00 01 00 00 00 00 00 00  |................|
00000050  00 02 03 f2 00 00 00 00  00 00 00 02 03 32 08 01  |.............2..|
00000060  00 00 03 00 85 69 6e 66  69 6d 75 6d 00 09 03 00  |.....infimum....|
00000070  08 03 00 00 73 75 70 72  65 6d 75 6d 00 11 0d 10  |....supremum....|
00000080  00 10 05 00 9e 53 59 53  5f 44 41 54 41 46 49 4c  |.....SYS_DATAFIL|
00000090  45 53 00 00 01 a3 1b 17  00 00 18 05 00 74 73 74  |ES...........tst|

The FIL_PAGE_INDEX constant is defined as 17855 in MySQL. In hexadecimal, it’s 0x45BF. It’s at position 0x18 in the example above.

And here’s an example of the external page:

# hexdump -C 0000000000001414.page | head
00000000  0a 4e 9d 5c 00 00 05 86  00 00 00 00 00 00 00 00  |.N.\............|
00000010  00 00 00 00 00 6b 72 b9  00 0a 00 00 00 00 00 00  |.....kr.........|
00000020  00 00 00 00 00 00 00 00  22 42 ff ff ff ff 22 30  |........"B...."0|
00000030  22 3e 3c 2f 46 4f 4e 54  3e 3c 2f 50 3e 53 49 5a  |"></FONT></P>SIZ|
00000040  45 3d 22 31 22 20 41 4c  49 47 4e 3d 22 4c 45 46  |E="1" ALIGN="LEF|
00000050  54 22 3e 3c 46 4f 4e 54  20 46 41 43 45 3d 22 56  |T"><FONT FACE="V|
00000060  65 72 64 61 6e 61 22 20  53 49 5a 45 3d 22 31 22  |erdana" SIZE="1"|
00000070  20 43 4f 4c 4f 52 3d 22  23 30 30 30 30 33 33 22  | COLOR="#000033"|
00000080  20 4c 45 54 54 45 52 53  50 41 43 49 4e 47 3d 22  | LETTERSPACING="|
00000090  30 22 20 4b 45 52 4e 49  4e 47 3d 22 30 22 3e 42  |0" KERNING="0">B|

Page type is 0x0A, as it should be.

When I extracted a page that stream_parser couldn’t find, it became clear why. The page type was 0x45BF! The page was a BLOB page, but the page type in the header was FIL_PAGE_INDEX.

How can you detect an InnoDB page if MySQL lies about its type? I believe the solution of this problem does exist, but for now there’s a workaround.

c_parser by default reads external pages from directory specified by -d option:

# ./c_parser

Error: Usage: ./c_parser -4|-5|-6 [-dDV] -f <InnoDB page or dir> -t table.sql [-T N:M] [-b <external pages directory>]

...

-b <dir> -- Directory where external pages can be found. Usually it is pages-XXX/FIL_PAGE_TYPE_BLOB/

The -i option helps read external pages from a file (e.g. ibdata1):

    -i  -- Read external pages at their offsets from <file>.

After this trick, the table was successfully recovered.

Have a question? Ask the experts!

Previous Post Next Post