Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archive/tar, archive/zip: add ErrInsecurePath #55356

Open
neild opened this issue Sep 22, 2022 · 30 comments
Open

archive/tar, archive/zip: add ErrInsecurePath #55356

neild opened this issue Sep 22, 2022 · 30 comments

Comments

@neild
Copy link
Contributor

neild commented Sep 22, 2022

This is an alternative fix for #25849, as proposed by @dsnet in #25849 (comment).

The archive/tar and archive/zip readers return unsanitized paths from archives. Careless use of these paths leads to path traversal attacks.

Proposal:

An insecure filename is an absolute path, or a path containing a relative path component (../).
When tar.Reader.Next reads a file with an insecure filename, it returns tar.ErrInsecurePath.
When zip.NewReader opens an archive containing an insecure filename, it returns zip.ErrInsecurePath.
In both cases, the function also returns a usable object (a *tar.Header or *zip.Reader).

In the case where the caller wants to handle archives with insecure filenames, they may ignore the ErrInsecurePath error and perform whatever sanitization they find appropriate. If the caller takes no action, they get an error when processing an unsafe archive.

The advantage over automatically sanitizing filenames is that we don't silently change the semantics of archives. A tar archive may legitimately contain absolute path names; silently converting these to relative names seems more surprising than reporting an error. In addition, there isn't always an obvious sanitized name--we probably want to reject the name COM1 on Windows, but what would we rewrite it into?

@neild neild added the Proposal label Sep 22, 2022
@gopherbot gopherbot added this to the Proposal milestone Sep 22, 2022
@neild neild self-assigned this Sep 22, 2022
@neild neild added the Security label Sep 22, 2022
@dsnet
Copy link
Member

dsnet commented Sep 22, 2022

In addition, there isn't always an obvious sanitized name--we probably want to reject the name COM1 on Windows, but what would we rewrite it into?

It isn't clear to me whether the proposal includes returning ErrInsecurePath for a file named "COM1". I don't really see that as a security vulnerability since Windows won't let me create a file by that name anyways.

@neild
Copy link
Contributor Author

neild commented Sep 22, 2022

filepath.IsAbs considers COM1 and a number of other reserved names to be absolute:
https://go.googlesource.com/go/+/refs/tags/go1.19.1/src/path/filepath/path_windows.go#23

Using filepath does mean that whether an archive is insecure or not will depend on the platform we're executing on, but I think that's better than the alternatives. (On one hand, we shouldn't treat reserved names as safe on Windows; on the other hand, we shouldn't reject them elsewhere.)

@dsnet
Copy link
Member

dsnet commented Sep 22, 2022

Using filepath does mean that whether an archive is insecure or not will depend on the platform we're executing on

I find that behavior to be fairly surprising. Suppose someone were writing a service that you can send ZIP and TAR files to and it would report whether they are "safe". I would expect the answer given by the service to be agnostic to the platform it is running on. The example might seem esoteric, but it's not that far fetched from how Gmail auto-scans attachments for viruses.

@dsnet
Copy link
Member

dsnet commented Sep 22, 2022

In general, I would not expect archive/zip or archive/tar to depend on path/filepath for anything directly used by the Reader or Writer.

The only OS-specific behavior is isolated within functionality that deal with fs.FileInfo (e.g., zip.FieInfoHeader and tar.FileInfoHeader).

@bcmills
Copy link
Contributor

bcmills commented Sep 23, 2022

@dsnet

I don't really see that as a security vulnerability since Windows won't let me create a file by that name anyways.

I expect that many users of the archive packages use os.WriteFile to write their contents. Unfortunately, neither os.Create nor os.WriteFile sets O_EXCL when creating the file — would it silently write the contents to COM1 instead of failing?

@neild
Copy link
Contributor Author

neild commented Sep 23, 2022

So far as I can see, we can:

  1. Call an archive containing the file com1 insecure on all systems. This doesn't seem right; Unix systems shouldn't need to worry about Windows special file names.
  2. Call an archive containing the file com1 secure on Windows. This is definitely not right; our goal is to ensure that os.WriteFile with an archive filename is safe by default.
  3. Accept inconsistency and apply OS-specific behavior.

Of these options, OS-specific behavior seems the least objectionable to me.

@bcmills
Copy link
Contributor

bcmills commented Sep 23, 2022

There are also variations on (1) and (3) that we could consider: we could add a field or method to the Reader structs to allow callers to modify the validation hook.

1a. Call an archive containing the file com1 insecure by default on all systems, but make it easy to configure to accept those names on non-Windows platforms.
3a. Accept inconsistency and apply OS-specific behavior by default, but make it easy to configure to reject names that are insecure on any system.

(1a) provides better default security, but (3a) provides better backward-compatibility for existing users on Unix. 🤔

@bcmills
Copy link
Contributor

bcmills commented Sep 23, 2022

When zip.NewReader opens an archive containing an insecure filename, it returns zip.ErrInsecurePath.

Hmm. It's not clear to me how would override that behavior if (for example) one is writing the data to an isolated in-memory database rather than the local filesystem. Would it make sense to return the error from the *File methods instead (Open, OpenRaw, and maybe DataOffset), so that the caller has an opportunity to set an override of some kind in the FIle struct ahead of the call?

@neild
Copy link
Contributor Author

neild commented Sep 23, 2022

We would return ErrInsecurePath when encountering an insecure filename, but permit the user to ignore it. So:

zr, err := zip.NewReader(r, size)
if err == zip.ErrInsecurePath {
  err = nil // we will perform our own path sanitization
}
if err != nil {
  return err
}

@dsnet
Copy link
Member

dsnet commented Sep 26, 2022

Personally, I'm more of a fan of option 1 in #55356 (comment).

Even though ZIP has its heritage in Windows, and TAR has its heritage in Unix, there's no doubt that they've functionally become cross-platform archive formats when people construct archives on one system and pass it to another system. I'm just not comfortable with the idea of a ZIP or TAR file being valid solely based on what OS it is executing on.

That said, I haven't studied in depth the implications of option 2:

  • For example, is "path/to/COM1" just as dangerous "COM1"? I could imagine "COM1" by itself providing access to the serial port, but "path/to/COM1" not providing that behavior where the file simply can't be written to because Windows has some hardcoded limitation that files can't be named COM1. The behavior of filepath.IsAbs seems to support that interpretation since IsAbs("path/to/COM1") reports false, while IsAbs("COM1") reports true. If we assume that a ZIP file is always written into a directory, then isn't that semantically equivalent to everything being like "path/to/COM1"?

@neild
Copy link
Contributor Author

neild commented Sep 26, 2022

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file states:

Do not use the following reserved names for the name of a file:

CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9. Also avoid these names followed immediately by an extension; for example, NUL.txt is not recommended.

I'm not certain if WriteFile("./com1", b, mode) is an error or just a bad idea. I'm pretty sure it doesn't write to the COM1 device, although I haven't tested this.

If we're willing to do path sanitization on archive filenames, perhaps we can convert these reserved filenames into safe paths such as "./COM1". If we're going the sanitization route, however, then perhaps we should sanitize all filenames and add an InsecureName field containing the unsanitized name as proposed in #25849.

@jimmyfrasche
Copy link
Member

Apparently those names are global as they predate the introduction of directories in DOS (!!!) https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073

@rsc
Copy link
Contributor

rsc commented Sep 28, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Oct 5, 2022

The general approach sounds right but I'm not sure we can make the behavior completely portable. What do we do about paths with backslashes or colons? Are those insecure on all systems? It seems like they can't be treated the same on all systems. A file named \a\b is fine on Unix, as is com1.

Programs that really need to defend against cross-operating-system problems will still need their own rules, like the go command refuses to have 'com1' in a module path.

@rsc
Copy link
Contributor

rsc commented Oct 5, 2022

Also what about symlinks? Do we reject symlinks containing absolute or other "insecure" paths? A tar file could contain

x -> /

followed by

x/etc/passwd

@neild
Copy link
Contributor Author

neild commented Oct 5, 2022

This proposal doesn't address symlinks.

The vast majority of path vulnerabilities caused by the current archive/tar and archive/zip behavior have been in programs that don't process symlinks. Perhaps we should do something to address the case of an archive that creates a symlink and then writes to it, but addressing the simpler case of path traversal via filenames gets us 99% of the way there and is worth doing on its own.

@martin-sucha
Copy link
Contributor

It seems to me that a lot of those security issues emerged because we don't provide an easy, secure way to extract an archive out of the box, at least for the simple case of extracting (a subset of) regular files and directories. However, adding such facility would be a completely separate proposal and wouldn't help with the existing insecure code out in the wild now. As such, I think adding ErrInsecurePath error is reasonable.

It seems to me that the path checking should use OS-specific behavior. We want to guard against paths that are written to directly or when used with filepath.Join. COM1 should be disallowed on Windows and allowed on Unix-like systems. filepath.IsAbs returns true for filenames starting with # when running on Plan 9 and # is safe filename on Linux and Windows. We probably don't want to disallow filenames starting with # on all systems. More generally, if we had the handling the same on all OSes, suppose we are about to introduce a port to another operating system that handles certain other paths specially, do we possibly break backward compatibility for all the existing OSes again or do we not handle certain insecure paths on that newly supported system?

The zip format specifies that the filenames:

  • must not contain a drive or device letter or a leading slash
  • all slashes must be forward slashes

It seems that checking the above for ZIP files should be safe on all platforms. Of course additional checks for .. and platform-specific checks will be needed on top of that.

If we add ErrInsecurePath we need to be really clear in the documentation that the fact that ErrInsecurePath was not returned does not mean that extracting the entry is safe (because of the symlinks) and callers still need to be careful.
Specifically archive/zip documentation currently warns users about some of the unsafe cases, but archive/tar does not. Neither mentions the Windows global filenames or UNC paths (although it could be argued those are part of the absolute path check). Both package docs should mention all those cases.

@rsc
Copy link
Contributor

rsc commented Oct 12, 2022

@martin-sucha's example about #c/cons being invalid on Plan 9 but valid elsewhere seems like a good argument for OS-dependent behavior.

Assuming we make the disallow list OS-dependent, does anyone object to adding ErrInsecurePath and accepting this proposal?

@dsnet
Copy link
Member

dsnet commented Oct 12, 2022

It's not the behavior I would prefer, but SGTM. As @neild noted in #55356 (comment), there just doesn't seem like many good options.

@FiloSottile
Copy link
Contributor

This is great! One opinion on the color of the bike shed.

I feel like surfacing the file handle as part of the error would be cleaner than returning a handle and a non-nil error.

That is, having a *tar.Header field in tar.ErrInsecurePath and a *zip.Reader field in zip.ErrInsecurePath.

@neild
Copy link
Contributor Author

neild commented Oct 13, 2022

I'm not certain if WriteFile("./com1", b, mode) is an error or just a bad idea. I'm pretty sure it doesn't write to the COM1 device, although I haven't tested this.

I have now tested this, and this does indeed write to the COM1 device. So does WriteFile("/some/path/to/com1.extension"). (See #56217.)

I think this makes it clear that filename sanitization needs to be OS-dependent; the set of filenames which are unsafe on Windows but perfectly valid elsewhere is large.

Assuming #56217 is resolved by changing filepath.IsAbs to return true for all reserved filenames, including ones like a/nul.b, then this has no impact on this proposal, since I propose treating absolute filenames as insecure.

@neild
Copy link
Contributor Author

neild commented Oct 13, 2022

I feel like surfacing the file handle as part of the error would be cleaner than returning a handle and a non-nil error.

This seems substantially less convenient for the user.

for {
  h, err := tarReader.Next()
  // Ignore ErrInsecurePath, we know what we're doing.
  if err != nil && err != tar.ErrInsecurePath {
    return err
  }
  // ...
}

vs.

for {
  h, err := tarReader.Next()
  if e, ok := err.(*tar.ErrInsecurePath); ok {
    // Extract the header from the error.
    h = e.Header
  } else if err != nil {
    return err
  }
  // ...
}

@rsc
Copy link
Contributor

rsc commented Oct 20, 2022

The current behavior suggested by @neild is also the same kind of error-clearing that we do in os/exec with ErrDot.

@rsc
Copy link
Contributor

rsc commented Oct 20, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Oct 26, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: archive/tar, archive/zip: add ErrInsecurePath archive/tar, archive/zip: add ErrInsecurePath Oct 26, 2022
@rsc rsc modified the milestones: Proposal, Backlog Oct 26, 2022
@gopherbot
Copy link

Change https://go.dev/cl/449937 mentions this issue: archive/tar, archive/zip: return ErrInsecurePath for unsafe paths

@dmitshur dmitshur modified the milestones: Backlog, Go1.20 Nov 21, 2022
@gopherbot
Copy link

Change https://go.dev/cl/452495 mentions this issue: archive/tar, archive/zip: disable insecure file name checks with GODEBUG

@neild neild reopened this Nov 21, 2022
@neild
Copy link
Contributor Author

neild commented Nov 21, 2022

It appears that Docker images frequently contain tar files containing absolute paths. This change causes problems for programs which operate on these tar files. Since Docker is not a small or uncommon use case, reopening this issue to evaluate whether this influences our willingness to make this change.

One obvious point is that there should be a GODEBUG flag to restore the original behavior; https://go.dev/cl/452495 adds GODEBUG=tarinsecurepath=1 and GODEBUG=zipinsecurepath=1 to disable the insecure-path check.

Possible courses of action I see here:

  • Drop this change, leave everything alone. I don't think this is a viable path; there's too much evidence that the existing behavior of returning insecure paths without warning is hazardous.
  • Leave existing APIs alone, add new secure ones, deprecate the insecure APIs. We could, for example, add zip.NewReaderWithOptions and tar.NewReaderWithOptions, where the options include settings for whether to detect unsafe paths. If we want to go this path, we should revert the ErrInsecurePath change while we figure out the correct new API.
  • Keep the ErrInsecurePath change, but make it opt-in for a cycle. This gives the ecosystem more time to adjust to the change before turning it on.
  • Proceed as we are now: Keep the change, permit opting out.

gopherbot pushed a commit that referenced this issue Nov 21, 2022
Add GODEBUG=tarinsecurepath=1 and GODEBUG=zipinsecurepath=1 settings
to disable file name validation.

For #55356.

Change-Id: Iaacdc629189493e7ea3537a81660215a59dd40a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/452495
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
felixge pushed a commit to felixge/go that referenced this issue Nov 21, 2022
Add GODEBUG=tarinsecurepath=1 and GODEBUG=zipinsecurepath=1 settings
to disable file name validation.

For golang#55356.

Change-Id: Iaacdc629189493e7ea3537a81660215a59dd40a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/452495
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/452616 mentions this issue: archive/tar, archive/zip: disable ErrInsecurePath by default

gopherbot pushed a commit that referenced this issue Nov 22, 2022
This change is being made late in the release cycle.
Disable it by default. Insecure path checks may be enabled by setting
GODEBUG=tarinsecurepath=0 or GODEBUG=zipinsecurepath=0.
We can enable this by default in Go 1.21 after publicizing the change
more broadly and giving users a chance to adapt to the change.

For #55356.

Change-Id: I549298b3c85d6c8c7fd607c41de1073083f79b1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/452616
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/458875 mentions this issue: archive/tar, archive/zip: revert documentation of ErrInsecurePath

gopherbot pushed a commit that referenced this issue Dec 21, 2022
CL 452616 disables path security checks by default, enabling them
only when GODEBUG=tarinsecurepath=0 or GODEBUG=zipinsecurepath=0
is set. Remove now-obsolete documenation of the path checks.

For #55356

Change-Id: I4ae57534efe9e27368d5e67773a502dd0e56eff4
Reviewed-on: https://go-review.googlesource.com/c/go/+/458875
Reviewed-by: Russ Cox <rsc@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Damien Neil <dneil@google.com>
@gopherbot gopherbot modified the milestones: Go1.20, Go1.21 Feb 1, 2023
@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
@odeke-em odeke-em modified the milestones: Go1.22, Go1.23 Jan 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

10 participants