Truncated paths when unarchiving #39
Comments
charliermarsh pushed a commit to astral-sh/tokio-tar that referenced this issue (Feb 9, 2025):
…name truncation (#1)

This is from edera-dev/tokio-tar#3

---

I tracked down this issue in astral-sh/uv#5450 (comment)

> https://github.com/edera-dev/tokio-tar/blob/4ee357285b5053e6bfada7f117e530b4da94b74a/src/archive.rs#L317
>
> ```rust
> if is_recognized_header && entry.header().entry_type().is_pax_local_extensions() {
>     if self.pax_extensions.is_some() {
>         return Poll::Ready(Some(Err(other(
>             "two pax extensions entries describing \
>              the same member",
>         ))));
>     }
>     let mut ef = EntryFields::from(entry);
>     let val = ready_err!(Pin::new(&mut ef).poll_read_all(cx));
>     self.pax_extensions = Some(val);
>     continue;
> }
> ```
>
> if `Pin::new(&mut ef).poll_read_all(cx)` is `Poll::Pending` then `ready_err!` returns it, so the Pax extension is lost. The same would apply to a pending poll that occurs while a longlink or longname is being prepared. When `poll_next` is called again the next entry header is parsed.

This PR demonstrates the issue by creating an AsyncRead impl which pends every second time it is polled. Commenting out [this line](https://github.com/RazerM/tokio-tar/blob/15466052f63c47cf47decd4409a9b0e936302773/tests/all.rs#L816) makes the test pass, because the reader doesn't enter a pending state in the "wrong" place.

It is probably also the cause of dignifiedquire/async-tar#39
charliermarsh added a commit to astral-sh/tokio-tar that referenced this issue (Feb 9, 2025):
## Summary

Right now, if we hit a pending read while reading an entry, we end up discarding the data rather than preserving it for the next poll (e.g., for a PAX extension). You can also see this reported at dignifiedquire/async-tar#39.

This PR takes dignifiedquire/async-tar#55, but applies an additional change as that PR didn't work on its own, in my testing. Atop dignifiedquire/async-tar#55, we also store the pending `Entry` to ensure that if we're pending, we don't advance to the next entry on the next poll.

For more context, see: #1.
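
To make the failure mode concrete, here is a minimal, std-only sketch of the pattern described in these commit messages. It is a simplified model, not the tokio-tar or async-tar code: `Record`, `Archive`, `keep_in_flight`, and `drain` are hypothetical names, there is no waker or real I/O, and the simulated body read simply reports `Poll::Pending` on every odd-numbered call. With `keep_in_flight` set to `false`, a pending read of the extension record is dropped and the next poll moves on to the following header; with `true`, the in-flight record is stashed and resumed first.

```rust
use std::task::Poll;

/// Hypothetical, simplified archive records (not the real tar layout).
#[derive(Clone)]
enum Record {
    /// Extension record carrying the full path for the entry that follows.
    Pax(String),
    /// Regular entry; its header only holds a (possibly truncated) short name.
    File(String),
}

struct Archive {
    records: Vec<Record>,
    pos: usize,                // index of the next header to parse
    reads: usize,              // number of simulated body reads so far
    pax: Option<String>,       // extension data waiting for the next file entry
    in_flight: Option<String>, // extension body whose read returned Pending
    keep_in_flight: bool,      // false models the bug, true models the fix
}

impl Archive {
    fn new(records: Vec<Record>, keep_in_flight: bool) -> Self {
        Archive { records, pos: 0, reads: 0, pax: None, in_flight: None, keep_in_flight }
    }

    /// Simulated body read: Pending on the 1st, 3rd, ... call, Ready otherwise.
    fn poll_read_all(&mut self, body: &str) -> Poll<String> {
        self.reads += 1;
        if self.reads % 2 == 1 {
            Poll::Pending
        } else {
            Poll::Ready(body.to_owned())
        }
    }

    /// Yields the resolved name of each file entry.
    fn poll_next(&mut self) -> Poll<Option<String>> {
        loop {
            // Resume a stashed extension read before parsing another header.
            let record = match self.in_flight.take() {
                Some(body) => Record::Pax(body),
                None => {
                    let next = self.records.get(self.pos).cloned();
                    match next {
                        Some(r) => {
                            self.pos += 1; // the header is consumed here
                            r
                        }
                        None => return Poll::Ready(None),
                    }
                }
            };
            match record {
                Record::Pax(body) => match self.poll_read_all(&body) {
                    Poll::Ready(val) => self.pax = Some(val),
                    Poll::Pending => {
                        if self.keep_in_flight {
                            self.in_flight = Some(body);
                        }
                        // Without the stash, `body` is dropped here and the
                        // next poll starts at the following header.
                        return Poll::Pending;
                    }
                },
                Record::File(short) => {
                    // Prefer the extension path; fall back to the header name.
                    return Poll::Ready(Some(self.pax.take().unwrap_or(short)));
                }
            }
        }
    }
}

/// Polls until the archive is exhausted. A real executor would wait for a
/// wake-up instead of polling again immediately.
fn drain(mut archive: Archive) -> Vec<String> {
    let mut names = Vec::new();
    loop {
        match archive.poll_next() {
            Poll::Ready(Some(name)) => names.push(name),
            Poll::Ready(None) => return names,
            Poll::Pending => continue,
        }
    }
}
```

Draining an archive made of one `Record::Pax` followed by one `Record::File` returns the short header name when `keep_in_flight` is `false` and the full extension path when it is `true`, which mirrors the truncation reported in this issue.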
Hello folks!
While unpacking an async-tar generated archive, file paths got truncated when they were over 100 chars. This seemed strange since macOS could unpack the archive correctly, and so did `tar-rs` and other tools.

Full context here: https://discord.com/channels/273534239310479360/1045944060650717204
Here's the header of the archive:

I can see there that the file path (`3rdparty/https/hex.pm/packages/decimal/_build/default/lib/decimal/consolidated/Elixir.Hex.Solver.Constraint.beam`) is complete, but at some point during the read process, it gets lost.

I also tried iterating over the `.entries` and printing out `entry.path`, `entry.path_bytes`, `entry.header.path` and `entry.header.path_bytes`, and they all have the truncated file path: `3rdparty/https/hex.pm/packages/decimal/_build/default/lib/decimal/consolidated/Elixir.Hex.Solver.Con`.

Thanks @rrbrussell for the help debugging this 👋🏽
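
The 100-character cutoff lines up with the 100-byte `name` field of a tar header: a longer path has to travel in a PAX `path` record or a GNU long-name entry, and a reader that loses that record is left with only what fits in the field. The sketch below checks this against the path from the report. `header_name` and `exceeds_name_field` are hypothetical helpers written for illustration, not part of async-tar, and the sketch ignores the ustar `prefix` field.

```rust
/// Size of the `name` field in a tar header, in bytes.
const NAME_FIELD_LEN: usize = 100;

/// Full path from the report above.
const FULL_PATH: &str = "3rdparty/https/hex.pm/packages/decimal/_build/default/lib/decimal/consolidated/Elixir.Hex.Solver.Constraint.beam";

/// Hypothetical helper: the bytes of `path` that fit in the header name field.
fn header_name(path: &str) -> &[u8] {
    let bytes = path.as_bytes();
    &bytes[..bytes.len().min(NAME_FIELD_LEN)]
}

/// Hypothetical helper: true when `path` cannot be stored in the name field
/// alone and so depends on an extension record to survive intact.
fn exceeds_name_field(path: &str) -> bool {
    path.len() > NAME_FIELD_LEN
}
```

For `FULL_PATH`, `header_name` returns exactly the truncated path printed above, ending in `Elixir.Hex.Solver.Con`, which is consistent with the extension record being dropped during the read.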