proposal: archive/tar: implement fs.FS #61232

sding3 · 2023-07-07T17:06:20Z

archive/tar.Reader should implement fs.FS. This would put it on par with archive/zip.Reader and would let a tarball be accessed through the fs.FS interface without the intermediate step of extracting it to the filesystem.


seankhliao · 2023-07-07T18:30:54Z

I don't think this is feasible with the current API without in-memory buffering: archive/tar.NewReader takes an io.Reader, so it can read the source only once, and tar is not a generally seekable archive format.
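To make the single-pass constraint concrete, here is a minimal sketch that lists the same non-seekable source twice. The helpers buildTar, listNames and twoPasses are hypothetical names made up for this illustration; only the archive/tar calls are real API.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar returns an in-memory tar archive holding the given name/content pairs.
func buildTar(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listNames walks r once, front to back, and returns the entry names it saw.
func listNames(r io.Reader) ([]string, error) {
	tr := tar.NewReader(r)
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// twoPasses lists the same plain io.Reader twice.
func twoPasses(data []byte) (first, second []string, err error) {
	src := bytes.NewBuffer(data) // a bytes.Buffer is drained as it is read
	if first, err = listNames(src); err != nil {
		return nil, nil, err
	}
	second, err = listNames(src)
	return first, second, err
}

func main() {
	data, err := buildTar([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	first, second, err := twoPasses(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(first), len(second))
}
```

The second pass finds nothing because the first one consumed the stream, which is why an fs.FS built on a plain io.Reader would have to buffer the archive somewhere.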

sding3 · 2023-07-07T19:10:01Z

I think we can manage that by checking to see if the io.Reader also implements io.Seeker or io.ReaderAt, and error out the fs.FS operations when the io.Reader cannot be sought.
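A rough sketch of the capability check described here, using an ordinary type switch. The names accessKind, errNoRandomAccess, streamOnly and inMemory are hypothetical and exist only for this example; nothing like them is part of archive/tar.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errNoRandomAccess is what the fs.FS operations could return for stream-only sources.
var errNoRandomAccess = errors.New("source implements neither io.ReaderAt nor io.Seeker")

// accessKind reports which random-access interface, if any, r also implements.
func accessKind(r io.Reader) (string, error) {
	switch r.(type) {
	case io.ReaderAt:
		return "io.ReaderAt", nil
	case io.Seeker:
		return "io.Seeker", nil
	default:
		return "", errNoRandomAccess
	}
}

// streamOnly hides every method except Read, the way a pipe or a
// decompression stream would.
type streamOnly struct{ io.Reader }

// inMemory returns a reader that supports both ReadAt and Seek.
func inMemory(s string) io.Reader { return strings.NewReader(s) }

func main() {
	for _, r := range []io.Reader{inMemory("abc"), streamOnly{inMemory("abc")}} {
		kind, err := accessKind(r)
		fmt.Printf("%T: %q %v\n", r, kind, err)
	}
}
```

The check itself is cheap; the open question in the replies that follow is what the reader would then have to do with that capability.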

apparentlymart · 2023-07-07T22:38:23Z

I think even if the given reader did also implement io.Seeker and/or io.ReaderAt this would still be a pretty significant change to the existing design of tar.Reader, which currently relies on the underlying reader's tracking of position within the file and doesn't retain any memory of entries that were already visited.

It would be possible in principle to make a separate type that wraps an io.ReadSeeker and implements fs.FS by making a fresh tar.Reader for each call and scanning over potentially the entire input on each operation, but that would be pretty inefficient in comparison to most other fs.FS implementations such that I'm skeptical that it would be widely used. Of course, if you implement it and publish it somewhere that other people can use it then that might prove me wrong. 😀
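For illustration, a minimal sketch of that separate wrapper type, assuming only regular files matter and that copying the matched entry into memory is acceptable. rescanFS, memFile, buildTar and readFrom are hypothetical names; this is not how any existing package does it.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"io/fs"
)

// rescanFS implements fs.FS by rewinding and rescanning the archive on every Open.
type rescanFS struct{ src io.ReadSeeker }

// memFile is an fs.File holding one entry's contents in memory.
type memFile struct {
	info fs.FileInfo
	*bytes.Reader
}

func (f *memFile) Stat() (fs.FileInfo, error) { return f.info, nil }
func (f *memFile) Close() error               { return nil }

func (t *rescanFS) Open(name string) (fs.File, error) {
	fail := func(err error) (fs.File, error) {
		return nil, &fs.PathError{Op: "open", Path: name, Err: err}
	}
	if !fs.ValidPath(name) {
		return fail(fs.ErrInvalid)
	}
	if _, err := t.src.Seek(0, io.SeekStart); err != nil {
		return fail(err)
	}
	tr := tar.NewReader(t.src) // a fresh reader for each call
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return fail(fs.ErrNotExist) // scanned the entire archive without a match
		}
		if err != nil {
			return fail(err)
		}
		if hdr.Name != name || hdr.Typeflag != tar.TypeReg {
			continue
		}
		data, err := io.ReadAll(tr)
		if err != nil {
			return fail(err)
		}
		return &memFile{info: hdr.FileInfo(), Reader: bytes.NewReader(data)}, nil
	}
}

// buildTar returns an in-memory tar archive holding the given name/content pairs.
func buildTar(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readFrom opens one file through the fs.FS interface.
func readFrom(data []byte, name string) (string, error) {
	b, err := fs.ReadFile(&rescanFS{src: bytes.NewReader(data)}, name)
	return string(b), err
}

func main() {
	data, err := buildTar([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(readFrom(data, "b.txt"))
}
```

Every Open costs a scan of up to the whole archive, and a lookup miss always costs a full scan, which is the inefficiency described above. The sketch also shares one seek position across calls, so it is not safe for concurrent use.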

Do you have some specific ideas for how it would be implemented? If the proposal is to redesign tar.Reader's internals so that it can support random access by seeking and scanning (which it cannot do today), I expect you'd need to demonstrate significant demand for this capability to justify that additional complexity and the risk of significantly changing the existing working code. One way to achieve that would be to copy the contents of archive/tar into a separate library of your own and modify it to provide what you need and use that to demonstrate that it's feasible to implement and that the result is useful (hopefully it would be used by more than just you).

Based on experience with other proposals, I think you'd then also need to demonstrate that there's some benefit to it being in standard library rather than just remaining as a separate library maintained in your own repository.

neild · 2023-07-07T23:35:45Z

tar (Tape ARchive) is a streaming format. It's fundamentally unsuited for random access. You could build a random access reader, of course, but you'd be fighting the format and it would be a completely different implementation than archive/tar.

If you want random access within an archive, you're better off using zip, which is designed for the purpose.
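For comparison, this is roughly what the zip side looks like today: zip.NewReader takes an io.ReaderAt plus a size, and *zip.Reader can be passed directly to the io/fs helpers (Go 1.16 and later). buildZip and readFromZip are hypothetical helpers for this sketch.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io/fs"
)

// buildZip returns an in-memory zip archive holding the given name/content pairs.
func buildZip(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, f := range files {
		w, err := zw.Create(f[0])
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readFromZip reads one file by name without visiting the other entries' data.
func readFromZip(data []byte, name string) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	b, err := fs.ReadFile(zr, name) // *zip.Reader satisfies fs.FS
	return string(b), err
}

func main() {
	data, err := buildZip([][2]string{{"a.txt", "hello"}, {"docs/b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(readFromZip(data, "docs/b.txt"))
}
```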

sding3 · 2023-07-08T16:45:42Z

It looks like the zip format affords a "central directory" at the end of the file.

The first call of tar.Reader.Open() can build a similar "central directory" so the penalty is paid only once, and the godoc on that should make this clear.

Tar archives frequently appear as os.File, which supports random access (ReadAt).

sding3 · 2023-07-08T17:00:35Z

An alternative is to make a new constructor to allow for a more self-contained implementation and stronger distinction from the existing tar.Reader.

package tar

func FS(r io.ReaderAt, size int64) (fs.FS, error)
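A hedged sketch of how such a constructor might work: scan the archive once, remember where each regular file's data starts, and serve later opens with io.SectionReader. Everything named here (newIndexFS, indexFS, indexedFile, entry, buildTar, readIndexed) is hypothetical. The offset trick assumes tar.Reader does not read ahead past the header it just returned; that is observed behavior rather than a documented guarantee, and sparse files, directories and links are ignored.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"io/fs"
)

// entry records where one regular file's bytes live inside the archive.
type entry struct {
	info   fs.FileInfo
	offset int64
}

// indexFS is a read-only fs.FS backed by an in-memory index of the archive.
type indexFS struct {
	r     io.ReaderAt
	index map[string]entry
}

// indexedFile serves one entry straight from the underlying io.ReaderAt.
type indexedFile struct {
	info fs.FileInfo
	*io.SectionReader
}

func (f *indexedFile) Stat() (fs.FileInfo, error) { return f.info, nil }
func (f *indexedFile) Close() error               { return nil }

// newIndexFS pays the scanning cost once, up front.
func newIndexFS(r io.ReaderAt, size int64) (fs.FS, error) {
	sr := io.NewSectionReader(r, 0, size)
	tr := tar.NewReader(sr)
	index := make(map[string]entry)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return &indexFS{r: r, index: index}, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // directories and links are out of scope for this sketch
		}
		// Assumption: right after Next, sr sits at the first byte of the file data.
		off, err := sr.Seek(0, io.SeekCurrent)
		if err != nil {
			return nil, err
		}
		index[hdr.Name] = entry{info: hdr.FileInfo(), offset: off}
	}
}

func (t *indexFS) Open(name string) (fs.File, error) {
	e, ok := t.index[name]
	if !ok {
		return nil, &fs.PathError{Op: "open", Path: name, Err: fs.ErrNotExist}
	}
	data := io.NewSectionReader(t.r, e.offset, e.info.Size())
	return &indexedFile{info: e.info, SectionReader: data}, nil
}

// buildTar returns an in-memory tar archive holding the given name/content pairs.
func buildTar(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readIndexed builds the index over data and reads one file through fs.ReadFile.
func readIndexed(data []byte, name string) (string, error) {
	fsys, err := newIndexFS(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	b, err := fs.ReadFile(fsys, name)
	return string(b), err
}

func main() {
	data, err := buildTar([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(readIndexed(data, "b.txt"))
}
```

Building the index in the constructor rather than on the first Open keeps scan errors visible at construction time; a lazy variant could wrap the same loop in sync.Once.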

apparentlymart · 2023-07-08T22:43:23Z

An entirely new implementation that requires an io.ReaderAt and constructs a directory in memory does seem technically possible indeed, but it doesn't seem like it would have much in common with the implementation in archive/tar. You could implement it as a library outside of standard library to demonstrate how it would work, to see how much code it would potentially share with the existing archive/tar implementation, and to see whether it's widely useful enough to warrant potential inclusion in the standard library.

FWIW, in my experience with archive/tar I've typically passed it a compress/gzip.Reader wrapping an os.File rather than an os.File directly, because I most often interact with gzip-compressed tar files. I don't mean to say that there aren't situations where it's valuable to work directly with an uncompressed archive, but this demonstrates another important difference between tar and zip: the zip format uses a separate compressed stream for each file, with the directory describing the location of each compressed stream, so each file can be decompressed as a stream independently while it is read. tar doesn't support compression itself, so the overall tar stream is typically compressed separately for efficient distribution.
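A small sketch of the gzip-wrapped case, assuming an in-memory archive stands in for the file on disk. buildTarGz and namesInTarGz are hypothetical helpers. The point is that the decompressed stream handed to tar.NewReader offers only Read, so an approach that relies on seeking or ReadAt has nothing to work with.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// buildTarGz returns an in-memory gzip-compressed tar archive.
func buildTarGz(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	tw := tar.NewWriter(zw)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// namesInTarGz streams through the archive and also reports whether the
// decompressed stream could be used for seeking.
func namesInTarGz(data []byte) (names []string, seekable bool, err error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, false, err
	}
	defer zr.Close()
	_, seekable = io.Reader(zr).(io.Seeker) // *gzip.Reader has no Seek method
	tr := tar.NewReader(zr)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, seekable, nil
		}
		if err != nil {
			return nil, seekable, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	data, err := buildTarGz([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(namesInTarGz(data))
}
```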

seankhliao · 2023-07-08T22:52:53Z

There are existing implementations, neither of which seems popular enough to be worth adding to the standard library:

https://pkg.go.dev/github.com/nlepage/go-tarfs
https://pkg.go.dev/github.com/quay/claircore/pkg/tarfs

sding3 · 2023-07-10T14:09:28Z

Thanks - closing this out.

rsc · 2023-08-09T21:42:15Z

This proposal has been declined as retracted.
— rsc for the proposal review group

sding3 added the Proposal label Jul 7, 2023

seankhliao changed the title ~~archive/tar: implement fs.FS~~ proposal: archive/tar: implement fs.FS Jul 7, 2023

gopherbot added this to the Proposal milestone Jul 7, 2023

seankhliao added the WaitingForInfo label Jul 7, 2023

seankhliao added WaitingForInfo and removed WaitingForInfo labels Jul 8, 2023

seankhliao removed the WaitingForInfo label Jul 9, 2023

sding3 closed this as completed Jul 10, 2023
