Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: unsafe: add Data for arbitrary type casts #64843

Closed
gohryt opened this issue Dec 22, 2023 · 18 comments
Closed

proposal: unsafe: add Data for arbitrary type casts #64843

gohryt opened this issue Dec 22, 2023 · 18 comments
Labels
Milestone

Comments

@gohryt
Copy link

gohryt commented Dec 22, 2023

Proposal Details

Author
Consider myself as intermediate go developer.
Experience: 3 years in production with go.
Other language experience: the most experience is in Js, tried many interesting things from LLVM IR to Nim and Odin.

Related proposals
Has this been proposed before? No.
Affect error handling: no.

Would this change make Go easier or harder to learn, and why?
This will not affect most of go developers. This will not be mandatory-to-use.

Proposal

Add a new virtual type "Data" to unsafe package. Currently we have .Pointer and the real world example of it's usage is:

package main

type Coordinates struct {
	X int
	Y int
}

func As[T_as, T_from any](from *T_from) *T_as {
	return (*T_as)(unsafe.Pointer(from))
}

func main() {
	coordinates := Coordinates{X:10, Y:20}
	xy := *As[[2]int](&coordinates)
	
	fmt.Println(xy[0], xy[1])
}

it's solution which we have now, not best solution. Proposed .Data type should work as in example below:

package main

type Coordinates struct {
	X int
	Y int
}

func main() {
	coordinates := Coordinates{X:10, Y:20}
	xy := [2]int(unsafe.Data(coordinates))
	
	fmt.Println(xy[0], xy[1])
}

This solution will make our code cleaner, more understandable for normal go devs and will allow some conversions like as in case when we have two equal structures with different print (json, yaml) tags.

This may be too permissive for golang code so i think we should restrict it with size check - conversion should be succesful only if both's .Sizeof is equal.

Backward compatibility:
Full because of no "Data" keyword in package unsafe.

Cost of this proposal:
Tools to rework: compiler
Compile time cost: no cost
Runtime cost: no cost

Possible implementation:
Just as .Pointer

@gopherbot gopherbot added this to the Proposal milestone Dec 22, 2023
@gohryt gohryt changed the title proposal: unsafe: add .Data, allow function to .Pointer conversion proposal: unsafe: add .Data Dec 22, 2023
@Jorropo
Copy link
Member

Jorropo commented Dec 22, 2023

Have you considered writing this once:

func As[T, F any](from F) T {
	return *(*T)(unsafe.Pointer(&from))
}

And then using the "clean" As function ?


Also I think it is good if unsafe's syntax looks bad, it shouldn't be used in almost all situations.


How the implementation would look like ?
Because I don't see a clean way to handle cases like:

type S struct{
 x uint
 a, b, c unsafe.Data
 y uint
}

How is the compiler supposed to know how to layout S ?

I could see the compiler always rewriting this to unsafe.Pointer (which works because unsafe.Pointer's alignment and sizes are always known and don't depend on the thing inside) but then I don't understand the value, it wouldn't save memory allocations or improve the layout of structs.

@seankhliao seankhliao changed the title proposal: unsafe: add .Data proposal: unsafe: add Data for arbitrary type casts Dec 22, 2023
@adonovan
Copy link
Member

What kind of type is unsafe.Data? It's not a pointer; it's a (C++-like) reference to the variable pointed to by an unsafe.Pointer, which has unknown size and other characteristics. Adding a new fundamental data type (which means a new reflect.Kind, breaking all existing reflect.Type switches) would be a major language change just to eliminate a tiny amount of conversion in code that is doing something unsafe. That's clearly the wrong choice.

If you're using unsafe, the conversion is the least of your worries. Follow @Jorropo's suggestion and write a simple helper function.

@apparentlymart
Copy link

One way to think of unsafe.Pointer is as being a pointer to something of unknown type, and therefore unknown layout. It's okay for something like that to exist behind a pointer because the compiler doesn't need to know anything other than the platform's pointer size in order to pass that around.

This Data seems to instead be a direct value of unknown size and layout, and therefore it isn't clear to me how it could be used for any operations at all, because we can't store an unknown-sized value directly in the stack, or in cpu registers, etc.

Reading through the proposed solution to infer the underlying problem, it seems like the main thing you want is the ability to interpret an arbitrary block of memory as if it were some specific type. Rust has such an operation, which it calls "transmute". It is designed as a generic function which takes a value of any type (but the type must be known) and reinterprets its memory representation as some other specified type.

The As function in the opening example is effectively the same as Rust's transmute except that it lacks a compile-time check that the input and output types have the same size, so it's actually more like transmute_unchecked, which is in Rust but I think not exposed for end-program use.

Do you think that including something with the same API and implementation as your As function in the standard library as part of package unsafe would meet your need? That seems like a narrower solution whose behavior and constraints can therefore be more clearly specified. Your own program would then just be the call to that function, and so you won't need to worry about how ugly the implementation of that function looks.

Without special compiler support I don't think it would be possible to enforce that both types are the same size at compile time, but since we're in package unsafe I think it would be sufficient for the documentation to specify that:

  • The source and destination types must both have a layout guaranteed by the specification, or else the caller is depending on details of the current implementation.
  • The source and destination types must be of the same size.
  • The in-memory representation of the source value must also somehow be a valid value of target type.

If any of the above do not hold, the result is undefined.

@gohryt
Copy link
Author

gohryt commented Dec 24, 2023

"As" was just example of operation which we need sometimes. I'm not good in rust, but it's transmute seems like solution for problems i'm raising hare.

There are many cases where this operation may simplify our code, as example i worked on project (it was't soluction created by me) where some data was stored in byte slice and X begining bytes was used for meta information of different types. There were integers of different size and sometimes structures. Something like that:

x := make([]byte, 8)
_ = (*int64)(unsafe.Pointer(unsafe.SliceData(x)))

but much more uglier because there was not unsafe.Slice Data in stdlib when i face it

@Jorropo
Copy link
Member

Jorropo commented Dec 24, 2023

@gohryt imo this is bad code and you should write int64(binary.NativeEndian.Uint64(x)), it is optimized by the compiler, just as fast (compiles to one load) but is memory safe (can't accessibly access elements outside of the slice).

But that aside, I don't understand how unsafe.Data would help here, unsafe.SliceData is a pointer.
from what I understand of your proposal using unsafe.Data here would interpret the pointer itself as an int64, which is exactly the same as unsafe.Pointeruintptr conversion without memory safety:

x := int64(uintptr(unsafe.Pointer(unsafe.SliceData(x))))

"As" was just example of operation which we need sometimes. I'm not good in rust, but it's transmute seems like solution for problems i'm raising hare.

Can you give an other example of operation where unsafe.Data would be useful that As can't do ?
I don't believe there is one.

@dr2chase
Copy link
Contributor

Is the proposal here to invent a way to treat the fields of a structure as if you knew how they were laid out in memory?
I would very much like (as in, it is something I hope to try this year) reorder fields to save space and reduce the "span" of pointers within data, except in those cases where it is explicitly requested otherwise (e.g., for cgo, swig, that sort of thing).

This proposed change would affect all the developers who were deprived of the benefit of this optimization, and required (in many cases) to do by hand an optimization that a computer could do better and faster.

@dominikh
Copy link
Member

Is the proposal here to invent a way to treat the fields of a structure as if you knew how they were laid out in memory?
I would very much like (as in, it is something I hope to try this year) reorder fields to save space and reduce the "span" of pointers within data, except in those cases where it is explicitly requested otherwise (e.g., for cgo, swig, that sort of thing).

There is a non-insignificant amount of unsafe Go code that relies on the current field order and padding. I'm also curious how explicit "explicitly requested" is. Currently, we might use cgo in one package, but pass struct types from another package to cgo.

@rsc
Copy link
Contributor

rsc commented Jan 24, 2024

If unsafe.Data is a type, what does 'var x unsafe.Data' mean?

@rsc
Copy link
Contributor

rsc commented Jan 26, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@gohryt
Copy link
Author

gohryt commented Jan 26, 2024

If we really need it as castable type it may be like array of size in bytes.
But as was mentioned above it may be transmute-like function which casts one object of size X to another of such size.

@qiulaidongfeng
Copy link
Contributor

It seems to me that the proposed unsafe.Data is an unsafe.Pointer with a type conversion syntax sugar.

@qiulaidongfeng
Copy link
Contributor

It seems that the purpose of unsafe.Data is to reduce the number of characters typed in type conversions compared to unsafe.Pointer.

var x unsafe.Data it's a few fewer characters than writing "var x unsafe.Pointer" in convert type.

For example in #64843 (comment)

@gohryt
Copy link
Author

gohryt commented Jan 26, 2024

@qiulaidongfeng
Are you saying that you have never encountered something like

/models/user.go

type User struct {
... // many many fields
}

/entities/user.go

type User struct {
... // copy of fields above
}

in your work?

@randall77
Copy link
Contributor

@gohryt In that case you could write, in package entities,

type User models.User

or perhaps

type User = models.User

@gophun
Copy link

gophun commented Jan 26, 2024

@qiulaidongfeng Are you saying that you have never encountered something like

/models/user.go

type User struct {
... // many many fields
}

/entities/user.go

type User struct {
... // copy of fields above
}

in your work?

You don't need an unsafe cast to convert one to the other: https://go.dev/play/p/npTCvR8_kcE

@Jorropo
Copy link
Member

Jorropo commented Jan 26, 2024

@gohryt even if you really want to use unsafe because of import cycles or something (the solution would be to move the definition to a third package)

func Transmute[T, F any](from F) T {
 return *(*T)(unsafe.Pointer(&from))
}

// ...
u := Transmute[entities.User, models.User](u)

@rsc
Copy link
Contributor

rsc commented Feb 1, 2024

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Feb 8, 2024

No change in consensus, so declined.
— rsc for the proposal review group

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Declined
Development

No branches or pull requests