Skip to content

zdb: better handling for corrupt block pointers #17166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

asomers
Copy link
Contributor

@asomers asomers commented Mar 21, 2025

When dumping indirect blocks, attempt to print corrupt block pointers rather than abort the program.

Sponsored by: ConnectWise
Signed-off-by: Alan Somers asomers@gmail.com

Motivation and Context

When trying to print a file's indirect blocks with "zfs -vvbbbb", if that file contains corrupt block pointers zdb will fail assertions and crash.

Description

Remove three assertions, replacing them with warnings printed to stderr. The rest of zdb does not require those block pointers to be intact.

How Has This Been Tested?

Tested with a dataset that had become corrupted, similarly to #17077 .

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

When dumping indirect blocks, attempt to print corrupt block pointers
rather than abort the program.

Sponsored by:	ConnectWise
Signed-off-by:	Alan Somers <asomers@gmail.com>
Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that assertions are not good for the data we read from disks and that might get corrupted, but I am not sure just complaining to stderr is a good solution either. Error messages printed to stderr might be difficult to correlate to the output printed to stdout. Also since we use zdb as a pool verification tool in some CI tests, I am not sure spamming stderr without returning actual error status would be noticed.

@asomers
Copy link
Contributor Author

asomers commented Mar 24, 2025

I agree that assertions are not good for the data we read from disks and that might get corrupted, but I am not sure just complaining to stderr is a good solution either. Error messages printed to stderr might be difficult to correlate to the output printed to stdout. Also since we use zdb as a pool verification tool in some CI tests, I am not sure spamming stderr without returning actual error status would be noticed.

So what would you prefer? I could print the messages to stdout instead, and eventually exit with status 1.

@amotin
Copy link
Member

amotin commented Mar 24, 2025

Exit code would definitely be good. How to better print it I am not sure though. I haven't looked close on that part of the code. But I already saw zdb using stderr for non-error messages, making it useless as error indicator. And as I have noted their correlation with stdout was a pain. So I'd say if we shoudl print the pointer, lets integrate printing of invalid values there also.

Signed-off-by: Alan Somers <asomers@gmail.com>
@asomers
Copy link
Contributor Author

asomers commented Apr 7, 2025

The latest commit prints all of the errors to stdout instead of stderr. It wasn't possible to print the L1 block's fill error on the same line as the block pointer itself, at least not without buffering potentially massive amounts of text in RAM, so it's printed afterwards. The output looks like this:

       a90000000   L1  DVA[0]=<5:4088dc7a0000:15000> DVA[1]=<2:3a8fade2f000:15000> [L1 ZFS plain file] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/c000P birth=7970383L/7970383P fill=16305515943967419494 cksum=000010c4104a06a3:01b38f9c940ef690:543c183d6a120c26:3f99c20b19c3eb3e
...
       a93940000    L0 DVA[0]=<14483840:7928e65e665e1a00:e200000> DVA[1]=<0:148154a00000000:1c39e00> DVA[2]=<8394880:0:1000000> [L0 unallocated] inherit inherit unencrypted BE gang unique triple size=200L/200P birth=1158310211095814315L/4409647458288664576P fill=6480302737822677281 cksum=0a4c2450925129cd:0124e955ddeccd13:0d3c3894ff64c1dd:c5d48c69c7b34245(ERROR: Block pointer type (0) does not match dnode type (19))
...
       a93a60000    L0 HOLE [L0 unallocated] size=200L birth=12694941463093612928L(ERROR: Block pointer type (0) does not match dnode type (19))
...
       a90000000: ERROR: Block pointer fill (16305515943967419494) does not match calculated value (4160680085501909892)
       a94000000   L1  DVA[0]=<0:3de2940b4000:15000> DVA[1]=<4:4174597d5000:15000> [L1 ZFS plain file] fletcher4 lz4 unencrypted LE contiguous unique double size=20000L/c000P birth=7974332L/7974332P fill=897 cksum=0000108206fb0d15:019223eac4ced4e5:0d3002f5ea989522:7045956156f0242b

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is quite verbose. I wonder what happen with some code that may try to parse the output (I think we have some in ZTS), and without error status returned still it might be difficult to handle properly.

@asomers
Copy link
Contributor Author

asomers commented Apr 9, 2025

It is quite verbose. I wonder what happen with some code that may try to parse the output (I think we have some in ZTS), and without error status returned still it might be difficult to handle properly.

I could change the exit status to 1 if any corruption is found. However, there is precedent for not doing that. zdb will already report corruption and yet exit 0 in the following conditions:

  • unmatched FREE(s)
  • Livelist ALLOC ... from TXG ... FREED at TXG ...
  • DOUBLE ALLOC
  • DOUBLE FREE
  • Found block that crosses metaslab boundary
  • Found livelist blocks marked as allocated for indirect vdevs
  • while trying to open subobj id
  • path not found, possibly leaked
  • potentially leaked objects detected

So for consistency's sake, I think I should leave the exit status unchanged.

ASSERT3U(fill, ==, BP_GET_FILL(bp));
if (!err) {
if (fill != BP_GET_FILL(bp)) {
(void) printf("%16llx: Block pointer "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

print to stderr instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what my first commit did. But I switched to stdout in response to @amotin's comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think printing to stderr makes sense only if we exit immediately, so it is the last message before we return error status. But if we continue, then using different stream makes it difficult to understand what's actually wrong. Om the other side, if we print it to stdout, then it has to be formatted in some way nice to possible parsers or readers, unless we expect them to ignore all the output after getting an errors.

@asomers
Copy link
Contributor Author

asomers commented Apr 22, 2025

As discussed in the OpenZFS leadership meeting today, we agreed to change zdb to always exit non-zero if it detects any corruption. I'll create a distinct error code for that.

asomers added 2 commits April 23, 2025 08:46
Signed-off-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alan Somers <asomers@gmail.com>
@asomers
Copy link
Contributor Author

asomers commented Apr 23, 2025

@amotin @allanjude with the latest commits, zdb will exit 3 if on-disk corruption was detected. It will also follow the usual convention of exiting 2 due to invalid CLI usage.

Copy link
Member

@amotin amotin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just randomly looking around the handled cases I found couple more.

@@ -439,6 +442,7 @@ metaslab_spacemap_validation_cb(space_map_entry_t *sme, void *arg)
(u_longlong_t)txg, (u_longlong_t)offset,
(u_longlong_t)size, (u_longlong_t)mv->mv_vdid,
(u_longlong_t)mv->mv_msid);
corruption_found = B_TRUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just 13 lines up is also a case of double alloc.

@@ -542,6 +546,7 @@ mv_populate_livelist_allocs(metaslab_verify_t *mv, sublivelist_verify_t *sv)
(u_longlong_t)DVA_GET_VDEV(&svb->svb_dva),
(u_longlong_t)DVA_GET_OFFSET(&svb->svb_dva),
(u_longlong_t)DVA_GET_ASIZE(&svb->svb_dva));
corruption_found = B_TRUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar error 17 lines up.

Signed-off-by: Alan Somers <asomers@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants