Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory #58625

Open
bcmills opened this issue Feb 21, 2023 · 47 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Feb 21, 2023

Background

The documentation for unsafe.Pointer currently lists six valid conversion patterns:

  1. “Conversion of a *T1 to Pointer to *T2 … [p]rovided that T2 is no larger than T1 and that the two share an equivalent memory layout.”

  2. “Conversion of a Pointer to a uintptr (but not back to Pointer).”

  3. “Conversion of a Pointer to a uintptr and back, with arithmetic … in the same expression, with only the intervening arithmetic between them.”

  4. “Conversion of a Pointer to a uintptr when calling syscall.Syscall … [or] in the argument list of a call to a function implemented in assembly.” (Compare proposal: unsafe: clarify unsafe.Pointer rules for package syscall #34684.)

  5. “Conversion of the result of reflect.Value.Pointer or reflect.Value.UnsafeAddr from uintptr to Pointer … immediately after making the call, in the same expression.”

  6. “Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer … when interpreting the content of an actual slice or string value.”

The unsafeptr check provided by cmd/vet warns about uses that do not follow the above patterns. However, it currently flags many violations in x/sys/unix (see #41205) when addresses returned by system calls such as mmap are converted to Go pointers, and in code generated by ebitengine/purego (see #56487) when hard-coded addresses provided by the operating system or linker are converted to Go pointers. In both cases, the program is attempting to create Go pointers that refer to known-valid addresses that are not managed by the Go runtime; notably, the compiler's -d=checkptr mode does not flag them as invalid at run-time.

Ideally, the unsafe.Pointer documentation, the unsafeptr check in cmd/vet, the -d=checkptr mode in cmd/compile, and the real-world usage in syscall, x/sys, and similar low-level libraries should all agree on what is valid. This proposal aims to narrow that gap.

Proposal

I propose that we add another allowed case in the unsafe.Pointer documentation:

(7) Conversion of a uintptr to Pointer when the address is allocated outside of Go.

A uintptr containing a valid memory address allocated outside of Go
(such as by a system call) may be converted to Pointer.
The address must remain valid for as long as any Go pointer (of type Pointer
or any other pointer type) refers to it.

The address must remain unavailable to the Go runtime (for example, due to an
allocation using cgo or syscall.Mmap) for as long as any Go pointer
(of type Pointer or any other pointer type) refers to it.
[edited per https://github.com/golang/go/issues/58625#issuecomment-1440188404\]

The uintptr constant 0 may be converted to Pointer.
The resulting Pointer has the value nil.

addr, _, err := syscall.Syscall(…)
…
p := unsafe.Pointer(addr)

The unsafeptr check in cmd/vet would be changed to allow the new case, resolving the warnings in x/sys and ebitengine/purego.

The compiler's -d=checkptr mode would check conversions from uintptr to unsafe.Pointer. The conversion may throw at run-time if:

Alternatives

The vet warning can be worked around today by relying on a liberal reading of the “equivalent memory layout” rule, rewriting

var addr uintptr =p := unsafe.Pointer(addr)

(https://go.dev/play/p/ZWZxv6URqTW) as

var addr uintptr =p := *(*unsafe.Pointer)(unsafe.Pointer(&addr))

(https://go.dev/play/p/Wh5f8k0_yyR), on the theory that the memory layout of a uintptr must be in some sense “equivalent” to the memory layout of unsafe.Pointer. However, I believe that such a rewrite does not capture the intent of the code as accurately, and should not be necessary.

(CC @golang/runtime)

@gopherbot gopherbot added this to the Proposal milestone Feb 21, 2023
@bcmills bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label Feb 21, 2023
@prattmic
Copy link
Member

prattmic commented Feb 21, 2023

Overall, this looks good to me. A few minor questions:

The address must remain valid for as long as any Go pointer (of type Pointer or any other pointer type) refers to it.

Do we need this requirement? I don't think the GC will follow non-Go pointers, so I think this should be fine as long the application doesn't dereference the pointer when it is invalid. (I am imagining a library that temporarily unmaps (or more likely, protects PROT_NONE) pages during transient periods when they should not be accessed).

The compiler's -d=checkptr mode would check conversions from uintptr to unsafe.Pointer. The conversion may throw at run-time if: the uintptr refers to an invalid (unmapped) nonzero address.

How do we implement this? Add an unconditional dereference of the pointer? Does a PROT_NONE-protected page count as "valid"? If so, a dereference isn't sufficient, you'd need to either try to map over the page, or parse /proc/self/maps.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 21, 2023

Do we need this requirement? I don't think the GC will follow non-Go pointers

I think the bad pointer in Go heap check will trigger if an unsafe.Pointer variable whose storage is itself in Go memory contains a value that is too bogus (like maybe too close to 0, or formerly-but-no-longer part of the Go heap)?

Plus, if the non-Go allocator does actually unmap the address (instead of mapping it PROT_NONE), then it's possible the Go allocator will later make use of that address while the pointer still exists.

I would be ok with weakening the requirement to “must remain mapped”, but I'm not sure that I want to try to define “mapped” portably. 😅

@bcmills
Copy link
Contributor Author

bcmills commented Feb 21, 2023

How do we implement this?

Heavy emphasis on the “may”?

I was thinking we would call runtime.findObject and let it do its thing: if it calls badPointer so be it, and if it reports a nonzero base we fail with a checkptr error.

@ianlancetaylor
Copy link
Contributor

At first glance updating vet for this means simply permitting all conversions from uintptr to unsafe.Pointer, because in general vet has no idea whether the value is valid. Can we do better?

@ianlancetaylor
Copy link
Contributor

"This address must remain valid" seems both too strong and not strong enough. What actually matters is that the address is never allocated by the Go runtime. I'm not sure how to phrase that, though.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 22, 2023

At first glance updating vet for this means simply permitting all conversions from uintptr to unsafe.Pointer

I agree that's the simplest change to make to it. Part of the intent is to shift the burden of enforcement from a relatively noisy static check (unsafeptr) to a more precise dynamic one (checkptr, which is enabled by default under the race detector).

It is possible in theory for vet to continue to notice and flag cases where the uintptr is always derived from a pointer to Go memory, but I'm not sure it would be worth the effort to implement given that checkptr could detect those cases with very high precision.

Also, the large number of vet warnings in x/sys suggests that the existing vet check is probably not widely used today. I'm not sure that we would lose much in gutting it.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 22, 2023

What actually matters is that the address is never allocated by the Go runtime. I'm not sure how to phrase that, though.

Maybe:
“A uintptr containing a memory address allocated outside of Go
may be converted to Pointer. The address must remain unavailable
to the Go runtime (for example, due to an allocation using cgo or
syscall.Mmap) for as long as any Go pointer (of type Pointer
or any other pointer type) refers to it.”

@mknyszek
Copy link
Contributor

The address must remain unavailable to the Go runtime (for example, due to an allocation using cgo or syscall.Mmap) for as long as any Go pointer (of type Pointer or any other pointer type) refers to it.

Tying these two things together worries me. Having a program be incorrect if it leaves behind a pointer it never dereferences to some memory region that was unmapped and remapped by the Go runtime feels too subtle to me. I'm also not sure how to implement a dynamic check for this case since there's no way the runtime could identify that a pointer just sitting there previously referred to non-Go memory.

This is something of an insight from the arena proposal: it's really not safe to just unmap memory regions that Go pointers reference. The arena implementation waits until the GC can't find a Go pointer to the region before allowing that address space to be reused.

One way we can make this OK is to support GC-managed memory regions, where the GC will let you know (via a finalizer or similar mechanism) that it's really OK to unmap a region. (Side note: this is actually fairly easy to implement today, if we require/guarantee that memory regions are aligned to 8 KiB, and are at least 8 KiB in size. Unfortunately our most popular platforms all have 4 KiB physical pages...)

@bcmills
Copy link
Contributor Author

bcmills commented Feb 22, 2023

Having a program be incorrect if it leaves behind a pointer it never dereferences to some memory region that was unmapped and remapped by the Go runtime feels too subtle to me.

That's why it's unsafe! 😅

I'm also not sure how to implement a dynamic check for this case since there's no way the runtime could identify that a pointer just sitting there previously referred to non-Go memory.

If the GC rate is high enough, eventually the pointer will be unallocated when the GC scans it and trip the badPointer check, no?

This is something of an insight from the arena proposal: it's really not safe to just unmap memory regions that Go pointers reference.

Agreed, but I expect that in the vast majority of cases, the addresses converted from uintptr to unsafe.Pointer will never be unmapped and it will be a non-issue.

On the other hand, to the extent that it is a problem under this proposal, it is already a problem under cgo in general today: suppose that a C function allocates a memory region using mmap, passes that pointer into a C-exported Go function (perhaps as type void* a.k.a. unsafe.Pointer), and unmaps it after the Go function returns. That's exactly the same problem without involving uintptr conversions, no? If the runtime happens to acquire that address in a subsequent allocation, any lingering pointers may trigger the badPointer check during the next GC cycle.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 22, 2023

(CC @dominikh @timothy-king)

@mknyszek
Copy link
Contributor

mknyszek commented Feb 22, 2023

That's why it's unsafe! 😅

Haha, yeah I suppose so. :)

If the GC rate is high enough, eventually the pointer will be unallocated when the GC scans it and trip the badPointer check, no?

That's probably true. There are a few different failure modes and outcomes possible in today's GC (unclear what it might look like under different GC implementations). There's just varying levels of unsafety here and since this is one of those things that relies on pointers existing, so it's especially hard to track down the issue from just the error you get. Maybe that's just what we're willing to say is what happens with cgo and users should defensively nil out references to any non-Go mappings before unmapping them.

Agreed, but I expect that in the vast majority of cases, the addresses converted from uintptr to unsafe.Pointer will never be unmapped and it will be a non-issue.

Also probably true. I guess at the very least I'd argue for some strongly worded advice pointing out that this is really the only case that's straightforward to reason about, until we have some way for the GC to notify code that some memory region is no longer referenced.

On the other hand, to the extent that it is a problem under this proposal, it is already a problem under cgo in general today: ...

That's a good point. Though, I'd argue we should just offer a solution to that problem (independently of this proposal).

Lastly, I'm paranoid that we're codifying a constraint on GC implementations with this. It's ensuring that the GC always has to be somewhat tolerant of what actually is in a pointer slot. The runtime definitely relies on this property internally in the current implementation, but how does it restrict future implementations? And maybe I'm wrong and we've already codified this property, making it a moot point? I don't know yet either way.

@bcmills
Copy link
Contributor Author

bcmills commented Feb 22, 2023

Maybe that's just what we're willing to say is what happens with cgo and users should defensively nil out references to any non-Go mappings before unmapping them.

FWIW, as far as I can tell this is exactly analogous to the cgo pointer-passing rules: “C code may not keep a copy of a Go pointer after the call returns.” It just happens to run in the other direction too: if a C-like program passes a pointer to a Go function, the Go code may not keep a copy of the C pointer either. Fair's fair, after all! 🙃

I'm paranoid that we're codifying a constraint on GC implementations with this. It's ensuring that the GC always has to be somewhat tolerant of what actually is in a pointer slot. … And maybe I'm wrong and we've already codified this property, making it a moot point?

Between cgo and syscall.Mmap, I think we already have to have that tolerance, and I think it's too ingrained to ever remove at this point.

It may be that we could get by with requiring that the uintptr is actually a mapped address (not just “outside the Go heap and sufficiently far from zero”), but as @prattmic noted that may be harder to enforce precisely.

mrhdias added a commit to mrhdias/vte that referenced this issue Mar 4, 2023
@rsc
Copy link
Contributor

rsc commented Apr 5, 2023

If we say "uintptr to unsafe.Pointer is OK for non-Go pointers" then I think there's almost no way that vet could tell with any precision when to report an error. Any pointer being converted to uintptr and back might have come from C. The only clear case to flag would be:

x := uintptr(unsafe.Pointer(&v))
x += 2
p := unsafe.Pointer(x)

but that doesn't seem very common.

Is there something we can do to flag bad code in the vet check?

Perhaps we should run the unsafeptr vet check across the whole open source ecosystem?

@bcmills
Copy link
Contributor Author

bcmills commented Apr 5, 2023

If we say "uintptr to unsafe.Pointer is OK for non-Go pointers" then I think there's almost no way that vet could tell with any precision when to report an error.

Agreed — for vet to report an error it would have to be able to prove that a pointer may alias a Go allocation.

Is there something we can do to flag bad code in the vet check?

I think we should more-or-less give up on the vet check. It has so many false positives today that I doubt anyone uses it in codebases of significant size.

More importantly: it should be straightforward to implement a much more precise checkptr check for these conversions, and checkptr is already enabled by default in -race mode. So we would lose (static, imprecise) coverage from vet, but gain a comparable amount of (dynamic, precise) coverage from checkptr.

@rsc
Copy link
Contributor

rsc commented Apr 6, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rski
Copy link

rski commented Apr 13, 2023

Alternatives

The vet warning can be worked around today by relying on a liberal reading of the “equivalent memory layout” rule, rewriting

Fwiw, I worked around this in the past with a module that just has

func IntoUnsafePointer(v uintptr) unsafe.Pointer {
	return unsafe.Pointer(v)
}

Then in my main module I have vethack.IntoUnsafePointer(v).

Perhaps the go stdlib could provide something like unsafe.UnsafePointerFromNonGoPointer or a better named function? I could also imagine a runtime check at some point to assert that this pointer is not managed by the go gc

@nightlyone
Copy link
Contributor

Another way to solve the issue would be some kind of address space or memory tagging that makes sure that Go can still be a memory safe language and more of such hacks to make it even less memory safe can be avoided.

That would allow using normal structs again and track the data path in static verification tools and maybe dynamic runtime parts to support this.

I am aware that this might lead to a counter proposal, but want to know if that line of thought has been considered as I was unable to find that in this discussion.

@bcmills
Copy link
Contributor Author

bcmills commented Apr 21, 2023

Another way to solve the issue would be some kind of address space or memory tagging that makes sure that Go can still be a memory safe language and more of such hacks to make it even less memory safe can be avoided.

@nightlyone, sorry, but I don't see how that's relevant to this proposal.

Existing system calls and existing C functions may return pointers to parts of the address space outside of what is managed by the Go runtime. No change to the Go runtime or toolchain can change that fact.

The checkptr dynamic check may already look up addresses managed by the Go runtime, check the runtime's metadata about them, and flag misuses of unsafe.Pointer based on that metadata. Part of this proposal is to do so more aggressively, rather than relying on a noisier vet check to flag those misuses.

@mdempsky
Copy link
Member

The conversion may throw at run-time if: [...] the uintptr refers to an invalid (unmapped) nonzero address.

I'm not convinced this should fail. Whether an address is valid is a dynamic property that can change over the lifetime of the address space, whereas the same Go pointer value could remain live across those changes. It seems like a semantic mismatch to force validation at the initial conversion to unsafe.Pointer, if we're not going to re-validate them on address space changes.

I was thinking we would call runtime.findObject and let it do its thing: if it calls badPointer so be it, and if it reports a nonzero base we fail with a checkptr error.

FWIW, I believe this is what checkptr does already, unless I misunderstand what you mean: https://github.com/golang/go/blob/master/src/runtime/checkptr.go#L50

@bcmills
Copy link
Contributor Author

bcmills commented May 2, 2023

I believe this is what checkptr does already

Neat! Then that part is already taken care of. 😃

I guess under this proposal we would have to relax the check for pointers near 0, though:

if 0 < uintptr(p) && uintptr(p) < minLegalPointer {
throw("checkptr: pointer arithmetic computed bad pointer value")
}

(Instead, perhaps we could check that if each of the originals has a nonzero base, then the result must also have a nonzero base?)

@bcmills
Copy link
Contributor Author

bcmills commented May 2, 2023

Whether an address is valid is a dynamic property that can change over the lifetime of the address space, whereas the same Go pointer value could remain live across those changes.

Agreed. I'm leaning toward the variation in #58625 (comment); I'll update the top comment to use that.

@rsc
Copy link
Contributor

rsc commented May 3, 2023

Rewording again:

“A uintptr containing a memory address allocated outside of Go may be converted to Pointer. Any such Pointers must no longer be reachable from live Go values before that memory can be returned to its allocator."

It is unclear how the runtime can actually guarantee this - there are difficult implementation questions to eliminate all possible races. But it seems like the above is the model we'd want to give users, if we were going to guarantee something.

Given the complexities here, maybe we don't want to guarantee anything? What are the valid use cases for this other than mmap? We could say code should use syscall.Mmap, which does the conversion itself in code we control.

@bcmills
Copy link
Contributor Author

bcmills commented May 3, 2023

What are the valid use cases for this other than mmap?

Mostly cgo: if you call a C function that returns a uintptr and you know that it is the address of something in the C heap, you should be able to convert that uintptr to the corresponding Go pointer type.

It also appears to be needed for a couple of other system calls that allocate memory in the caller's address space when they are invoked through syscall.Syscall or equivalent, such as shmat(2) on POSIX platforms, or LocalAlloc or LockResource on Windows.

There are also a number of Windows C libraries that are loaded and called dynamically, such as GetSidIdentifierAuthority in x/sys/windows.

There are also some uses in x/sys that convert uintptr arguments passed to Windows callbacks, such as in ctlHandler. It's not obvious to me whether those could be replaced with typed pointer arguments.

Finally, these conversions could be used for relocations on platforms with predictable memory layouts (such as the github.com/ebitengine/purego usage reported in #56487), and for machines with memory-mapped I/O devices (perhaps certain embedded devices, particularly for ARM and MIPS?).

@rsc
Copy link
Contributor

rsc commented May 24, 2023

@aaronbee thanks, added #60409

@rsc rsc changed the title proposal: unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory proposal: unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory May 24, 2023
@gngpp

This comment was marked as off-topic.

@database64128
Copy link
Contributor

what is the amount of overhead you are seeing from the mmapper? Are you concerned about the lock or the map operations? When I wrote that I expected Mmap calls would be infrequent, certainly not often enough to make a trivial lock+map+unlock a bottleneck.

Thank you for asking these follow-up questions.

Yes, you're right that mmap calls are usually infrequent and mmapper is unlikely to become a bottleneck. So I completely understand if you decide to keep mmapper.

I initially wrote my own mmap wrappers because:

  • I took a look at the mmapper code and really disliked the approach. Just because we could afford a few extra sequentially consistent atomic operations, doesn't mean we should.
  • mmapper can be trivially bypassed by reslicing like a[low : high : max], which I do use in part of my code.
  • My code had to support Windows, and I was unwilling to replicate mmapper's behavior in my own code.
  • I wanted to learn more about mmap, and the best way seemed to be implementing my own syscall wrappers.
  • munmap actually allows you to unmap only part of a mapped address space. Then theoretically you could fill that space again by mmaping other stuff into it. This is not supported by mmapper.

database64128 added a commit to database64128/shadowsocks-go that referenced this issue Jun 20, 2023
Since upstream is unwilling to drop mmapper or desensitize unsafeptr, I decided to take matters into my own hands by switching to even more "unsafe" conversions to get rid of the useless warning. Kind of ironic, isn't it?

golang/go#58625
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

It seems like we should try to make the unsafeptr check accurate, perhaps by ignoring uintptr conversions made on results from syscall.Syscall and maybe also x/sys/windows.Proc.Call (that wraps syscall.Syscall so maybe it's not a special case if we flow information up like we do in the printf checker). If we can make unsafeptr accurate, then that can be the rules we write down too. And I think we'd just say runtime is very special and never going to meet the unsafeptr rules and just have unsafeptr disable itself on runtime.

Does anyone want to try to improve the unsafeptr precision?

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Here are the results of running the unsafeptr check on the latest version of all public modules stored on proxy.golang.org that are required by >= 1 other modules (this filters out many forks). https://gist.github.com/rsc/d56cf40e010112f67339340101f92f17

A few observations:

Seeing these results, it seems to me that fixing the unsafeptr check to be precise is a hopeless cause. We simply cannot decide statically.

However, among the many false positives, there are real problems unsafeptr turns up, like converting an integer off the network to a channel or even subtle misuses of reflect that store uintptr-typed pointers onto the stack where the GC will not see them. We want to preserve some way to find and report these. We have a way to do this today, namely -d=checkptr=1, which checks unsafe.Pointer(u) conversions and is enabled automatically in -race and -msan builds. Because it is a dynamic check, it can permit code to synthesize non-Go pointers but then block the exact same code and call stack from synthesizing a Go pointer.

If we can make that unsafe.Pointer(u) check very cheap, say under 10X the cost of an array bounds check, then perhaps it would make sense to enable it always, not just with -d=checkptr=1. Based on discussion with @mknyszek, @mdempsky, @aclements, and others, it seems like we probably can make the check cheap enough to run in ordinary production use. I think we should explore that. Specifically, I suggest:

  1. In the spec's unsafe.Pointer rules, add a new rule allowing any of the existing uintptr->unsafe.Pointer cases to include the addition of an offset within the same allocation, addressing a common false positive case above.
  2. Also add a new rule allowing conversion of a uintptr to unsafe.Pointer when it creates a pointer to non-Go memory that will only be used to access non-Go memory. (In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.
  3. Improve the speed of -d=checkptr=1 to the point where it can be enabled at minimal cost in production code. This will also require the compiler to recognize idioms like the body of noescape and elide the check in those functions. Anywhere the compiler elides the check, that should be because the compiler avoided creating a uintptr at all, keeping the pointer pointer-typed the whole time.
  4. Work to ensure that the vast majority of code in the ecosystem runs fine with -d=checkptr=1.
  5. Enable -d=checkptr=1 by default, controlled by a GODEBUG.
  6. Delete the unsafeptr vet check.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

For this proposal issue, I think all we need to agree on is (1) and (2) in my previous comment, namely the additions to the spec's rules for unsafe.

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@cherrymui
Copy link
Member

cherrymui commented Jul 5, 2023

In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.

I'm not sure I understand this part, or at least how we're going to implement it. x[i-1] (syntactically) doesn't include an unsafe conversion. How are we going to disallow it (both in the spec and at run time)?

Or should we disallow conversion like x := (*[1<<63-1]byte)(unsafe.Pointer(1)) if it overlaps with Go memory?

@bcmills
Copy link
Contributor Author

bcmills commented Jul 5, 2023

Or should we disallow conversion like x := (*[1<<63-1]byte)(unsafe.Pointer(1)) if it overlaps with Go memory?

I think we should do that. (I thought -d=checkptr already flagged that today, in the converted pointer straddles multiple allocations check?)

@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

In general we can't implement it. We can implement the specific case of x := (*[1<<63-1]byte)(unsafe.Pointer(1)) but I don't think we're going to instrument unsafe.Add(p, N). So if you did x := (*byte)(unsafe.Pointer(1)) and unsafe.Add(x, Ptr-1) I don't think we're going to catch that.

@bcmills
Copy link
Contributor Author

bcmills commented Jul 12, 2023

I agree that we shouldn't instrument unsafe.Add by default, but maybe it would be reasonable to do in -d=checkptr mode?

@mdempsky
Copy link
Member

  1. In the spec's unsafe.Pointer rules, add a new rule allowing any of the existing uintptr->unsafe.Pointer cases to include the addition of an offset within the same allocation, addressing a common false positive case above.

If I understand correctly, this means that if unsafe.Add(unsafe.Pointer(x), 1) is valid today, then users would be able to write it as unsafe.Pointer(x + 1) instead? That seems reasonable to me.

Caveat: This isn't actually always safe today with cmd/compile. For example, compiling https://go.dev/play/p/SWa3KGxqKPk with -live shows a live temporary at line 18, but not 11. I think that should be easy to fix though.

[2.] In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.

Can you elaborate what you mean that x[i-1] is not okay? Specifically, what obligations are on the implementation to provide when compiling that expression? E.g., is the compiler required to conservatively assume x[i-1] and &z may alias?

Note: An expression like uintptr(unsafe.Pointer(&z)) doesn't force z to be heap allocated. If it's stack allocated and the stack moves between assigning i and evaluating x[i-1], they may not alias. If this is an issue, we could force z to the heap when we lose track of what happens to the uintptr result so that it has a stable address.

[3.] Improve the speed of -d=checkptr=1 to the point where it can be enabled at minimal cost in production code.

Do you have specific expectations of what -d=checkptr=1 should/must catch? Or is it just whatever can be done without false positives and at extremely low overhead?

@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory Jul 12, 2023
@rsc rsc modified the milestones: Proposal, Backlog Jul 12, 2023
@mknyszek
Copy link
Contributor

This ended up in the compiler/runtime triage queue because the proposal was accepted, but we have some questions about what exactly the actions here are.

I think we're clear on what has to happen for (1), but we're less clear on the requirements for the rest (i.e. how is the compiler implementation restricted, and what should -d=checkptr=1 be looking for exactly? Or is it as simple as "whatever it can with low overhead"? (@mdempsky asked about both of these above.)

@rsc
Copy link
Contributor

rsc commented Jul 19, 2023

If I understand correctly, this means that if unsafe.Add(unsafe.Pointer(x), 1) is valid today, then users would be able to write it as unsafe.Pointer(x + 1) instead? That seems reasonable to me.

Caveat: This isn't actually always safe today with cmd/compile. For example, compiling https://go.dev/play/p/SWa3KGxqKPk with -live shows a live temporary at line 18, but not 11. I think that should be easy to fix though.

SGTM thanks.

[2.] In particular x := (*[1<<63-1]byte)(unsafe.Pointer(1)) is OK and i := uintptr(unsafe.Pointer(&z)) is OK, but x[i-1] is not.

Can you elaborate what you mean that x[i-1] is not okay? Specifically, what obligations are on the implementation to provide when compiling that expression? E.g., is the compiler required to conservatively assume x[i-1] and &z may alias?

By "not OK" here I just meant it's an invalid program. I did not mean that the compiler has to catch it or enable the runtime to catch it. We could imagine disallowing the (*[1<<63-1]byte)(unsafe.Pointer(1)) because it spans the Go address space, but I don't think that's terribly useful since you can still do

x := unsafe.Pointer(1) // OK
i := uintptr(unsafe.Pointer(&z)) // OK
y := unsafe.Add(x, i-1) // oops created a Go pointer

This program is invalid but I don't think we should instrument unsafe.Add to look for transitions from "not Go" argument to "Go" result. unsafe.Add should probably still stay a simple add instruction. We can't actually catch all misuse of unsafe. If someone is determined, they will find a way to write something that defeats checkptr. For example the checkptr implementation is not going to catch you doing:

var x unsafe.Pointer
*(*uint64)(unsafe.Pointer(&x)) = myInteger
use(x)

which probably works most of the time. :-)

Note: An expression like uintptr(unsafe.Pointer(&z)) doesn't force z to be heap allocated. If it's stack allocated and the stack moves between assigning i and evaluating x[i-1], they may not alias. If this is an issue, we could force z to the heap when we lose track of what happens to the uintptr result so that it has a stable address.

Understood. It's not an issue because it's not a valid program.

[3.] Improve the speed of -d=checkptr=1 to the point where it can be enabled at minimal cost in production code.

Do you have specific expectations of what -d=checkptr=1 should/must catch? Or is it just whatever can be done without false positives and at extremely low overhead?

The main thing is to catch honest mistakes, like:

u := reflect.ValueOf(p).Pointer()
y := (*T)(unsafe.Pointer(u))

or similar code with struct offsets added to u, say. (The mistake is that u has type uintptr so it's not going to be updated if p moves or stop p from being GC'ed.)

Whatever can be done without false positives and at extremely low overhead sounds good. -d=checkptr=1 can still do more if there's something important to keep doing with more overhead. The low overhead one would be enabled for -d=checkptr=0.

database64128 added a commit to database64128/cubic-rce-bot that referenced this issue Nov 18, 2023
Since upstream is unwilling to drop mmapper or desensitize unsafeptr, I decided to take matters into my own hands by switching to even more "unsafe" conversions to get rid of the useless warning. Kind of ironic, isn't it?

golang/go#58625
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Projects
Status: Accepted
Development

No branches or pull requests