Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: define return statement's result assignment order? #58233

Closed
bradfitz opened this issue Feb 1, 2023 · 26 comments
Closed

proposal: spec: define return statement's result assignment order? #58233

bradfitz opened this issue Feb 1, 2023 · 26 comments

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Feb 1, 2023

A coworker sent out a code review with:

func As(err error) (_ Error, ok bool) {
	var e Error
	return e, errors.As(err, &e)
}

... and during review I said I wasn't sure whether it works. Turns out it worked but it's undefined. It happens to work in cmd/compile and doesn't work in gccgo.

The spec defines assignment order for e.g. a, b = foo(), bar() or a, b = b, a but not for return.

Another example which has different results between gc and gccgo: https://go.dev/play/p/SDMlczFBshC

I propose the spec defines this. My preference would be left to right assignment to results.

But admittedly it might break code like above. (which we fortunately rewrote to be simpler and explicit)

/cc @ianlancetaylor @griesemer (sorry)

@ianlancetaylor
Copy link
Contributor

CC @mdempsky

@griesemer
Copy link
Contributor

I'd think that returns work like assignments (of the result values to the result variables) and thus should behave the same way with respect to evaluation order. (Haven't investigated the spec yet.)

@griesemer griesemer self-assigned this Feb 1, 2023
@mdempsky
Copy link
Member

mdempsky commented Feb 2, 2023

Note that even x, y = e, f(&e) is undefined as to whether e is evaluated before or after the f(&e) call.

FWIW, cmd/compile already uses the same logic to handle assignment and return statements.

@go101
Copy link

go101 commented Feb 2, 2023

Related: #27821 (Edit: sorry, it should be #27804)

@beoran
Copy link

beoran commented Feb 3, 2023

Perhaps this should be documented as an implementation restriction? "A compiler may disallow return statements for which the order of evaluation cannot be determined due to side effects." Or something similar?

@go101
Copy link

go101 commented Feb 3, 2023

It can determine for sure, it just doesn't now.

@rsc
Copy link
Contributor

rsc commented Mar 29, 2023

I agree that if you have

var e Error
return e, errors.As(err, &e)

then the spec does not guarantee whether the first result is evaluated before or after the call to errors.As.

But if you have:

var e Error
a, b := e, errors.As(err, &e)
return a, b

then I don't believe the spec defines whether a = e happens before or after the call to errors.As either.

In either case, I don't think that's something we can easily change at this point.

@rsc
Copy link
Contributor

rsc commented Mar 29, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@go101
Copy link

go101 commented Apr 1, 2023

We can use an eval function to assure the order.

func Eval[T any](v T) T {
	return v
}

func As(err error) (_ Error, ok bool) {
	var e Error
	return Eval(e), errors.As(err, &e)
}

[update]: similar to #36449

@ianlancetaylor
Copy link
Contributor

@go101 In my opinion it would be better to not write such confusing code in the first place.

In my opinion we should either fully specify the evaluation order or we should have a vet check to warn about cases in which we both read and write a variable in the same statement. The vet check wouldn't be perfect but it might be able to catch cases like this.

@rsc
Copy link
Contributor

rsc commented Apr 6, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@twmb
Copy link
Contributor

twmb commented Apr 6, 2023

I've used this pattern extensively ever since I learned of it. It did seem odd at first blush. Nowadays it feels similar to an inline defer, if that makes sense.

I'd venture to guess that defining it strictly as left-to-right would break a lot of code; my preference is to officially define the current semantics. I'd be -1 on a vet but mostly because I like the current pattern a lot for very boilerplate HTTP methods (e.g., return body, issuePost(&body)), but I'm not too tied to this either way.

It seems to be that defining the semantics would go a long way towards eliminating confusion. I remember being just as confused by defer having the ability to modify named returns after a return -- but after internalizing that, it's clear and obviously useful.

@ianlancetaylor
Copy link
Contributor

Note that there is not a current semantics. The exact behavior depends on compiler details that can vary from release to release.

@twmb
Copy link
Contributor

twmb commented Apr 6, 2023

Interesting. In that case, I do still recommend officially defining the current behavior as per cmd/compile. Looking through code I have checked out with the rather lazy rg --pcre2 'return (?:[[:alnum:]]+, )*([[:alnum:]]+), .*\([^)]*&\1\b', which isn't comprehensive, shows (besides my own libraries) a few libraries that are vendored in the kubernetes repo as well as the prometheus/client_golang repo:

Vendored in kubernetes, but also their own actual real libraries:
https://github.com/Azure/go-autorest/blob/553a90ae65a6a2b18306fa04d7b1625960c5decb/autorest/utility.go#L64-L65
https://github.com/vmware/govmomi/blob/main/vapi/tags/categories.go#L93-L94

Vendored in kubernetes and not in the library anymore:
https://github.com/kubernetes/kubernetes/blob/master/vendor/github.com/cilium/ebpf/internal/btf/types.go#L603-L604

Prometheus:
https://github.com/prometheus/client_golang/blob/main/api/prometheus/v1/api.go#L894-L895

@ianlancetaylor
Copy link
Contributor

I think that you are suggesting that we define

    return e, errors.As(err, &e)

as being implemented as

  1. Evaluate errors.As(err, &e).
  2. Evaluate e.

The rule would be something like "when multiple expressions appear on the right hand side of an assignment or return statement, first evaluate all the function calls, method calls, and channel operations in left-to-right order. Then evaluate all the other expressions in left-to-right order."

That strikes me as a strange choice to make. Wouldn't it be simpler say "evaluate all expressions in left-to-right order"? If we're going to give a precise definition for this kind of confusing code, shouldn't we aim for the simplest precise definition?

@twmb
Copy link
Contributor

twmb commented Apr 6, 2023

Another way to frame it might be, "when multiple expressions appear on the right hand side of an assignment or return statement, evaluate all operations that can modify an expression in left to right order, and then evaluate all other expressions in left to right order", though it is perhaps less explicit. It seems a bit easier to explain than deferred functions, where the wording is currently "deferred functions are executed after any result parameters are set by that return statement but before the function returns to its caller" (and then a large example to be clearer).

@ianlancetaylor
Copy link
Contributor

Maybe I'm too close to it, but to me those seem like very different things.

A deferred function is executed after the return statement is complete, and after the values have been assigned to the result parameters. That seems clear. There is no ambiguity or confusion about order of evaluation. 1) Evaluate all return expressions (in some order); 2) Assign them all to result parameters; 3) Execute deferred functions.

The order you suggest, which is similar to the one I mentioned, is still more complicated than "evaluate all expressions in left to right order." And my point is really that, if we are going to specify an order of evaluation, why wouldn't we just say "evaluate all expressions in left to right order"? Why would we want to say something that is more complicated than that?

@twmb
Copy link
Contributor

twmb commented Apr 6, 2023

I agree that "evaluate all expressions in left to right order" is much simpler to reason about. Potential points in favor of being more complicated are,

  • People are already relying on this behavior -- albeit, without realizing their code is broken on gccgo and there isn't actually a guarantee their code will work in the future

  • Defining strict left-to-right may result in even more subtle bugs.

If I have

type foo struct {
    copyByVal  string
    copyByAddr map[string]bool
}

func modify(f *foo) bool {
    f.copyByVal = "modified"
    f.copyByAddr["modified"] = true
    return true
}

func problem() (foo, bool) {
    f := foo{copyByAddr: make(map[string]bool)}
    return f, modify(&f)
}

With the current behavior, foo is consistently deeply modified before being returned. With the proposed strict left-to-right, fields in a struct that are non-pointers are returned as is, while fields that are pointers are still modified, resulting in some weird half state modification.

However, defining a vet, or more strictly, refusing to compile this type of return would be a strong signal to developers that they can't do this anymore (and will force ecosystem library churn)

@twmb
Copy link
Contributor

twmb commented Apr 7, 2023

Broadly speaking, I think this issue can be generalized to this sentence in the spec,
"However, the order of those events compared to the evaluation and indexing of x and the evaluation of y is not specified."
and the section following. The current cmd/compile implementation evaluates all functions first. It may be worthwhile to use this issue to specify all of these types of operations (not just return order evaluation).

@08d2
Copy link

08d2 commented Apr 7, 2023

return f, modify(&f)

Such code seems to me to be pretty clearly buggy, in a similar way to code that relied on map iteration order (prior to the randomization update).

An update to the spec might be "The order of those events ... is not specified and is unpredictable. Code which relies on any specific ordering behavior is incorrect. A future version of Go may codify this unpredictability."

@go101
Copy link

go101 commented Apr 7, 2023

The current cmd/compile implementation evaluates all functions first.

This is not true. The current cmd/compile implementation has several self-inconsistencies. See #36449

@rsc
Copy link
Contributor

rsc commented Apr 12, 2023

We can't adopt the rule "run function calls then evaluate expressions", because then if you have code like

return f(), g()

or

x, y := f(), g()

and you have a tool that inlines all the calls to f() (suppose func f() { return global }) you get

return global, g()

or

x, y := global, g()

Before global was evaluated before g(); now it's evaluated after g(). Of course the rewritten code is undefined today, but if we're going to define it, we should define it consistent with manual inlining, which would be a strict left-to-right order.

Return and assignments also have to match, so if we're going to make a rule, it has to be for both, which is why I showed both above.

All that said, I don't think we are considering redoing evaluation order in assignments today. If someone wants to implement strict left-to-right and measure the performance hit, that would be useful data to open a conversation about evaluation order more broadly.

@twmb
Copy link
Contributor

twmb commented Apr 12, 2023

Am I reading this correctly,

  • Things right now are undefined
  • There's no willingness to define
  • If things were defined, left-to-right is the way to go from a consideration of function inlining
  • There's an ecosystem that relies on the current behavior of cmd/compile that is actually broken with gccgo

Unfortunately it seems like this would've been valuable to specify in the 1.0 spec. Since it's too late for that -- I'd side with @ianlancetaylor's original comment of adding a vet check, to gradually migrate the existing ecosystem away from the currently assumed behavior.

@timothy-king
Copy link
Contributor

In my opinion we should either fully specify the evaluation order or we should have a vet check to warn about cases in which we both read and write a variable in the same statement. The vet check wouldn't be perfect but it might be able to catch cases like this.

Note that errors.As(err, &e) is a may write. So I think a vet check would likely be for whether a variable is read in a different expression in a return's expression list than an expression in the return when a reference to the variable is taken and that reference may be written to. I am guessing this may be written would be highly over-approximate, and my concern for false-positives would be for whether to include all calls to methods with pointer receivers. FWIW what I am describing would warn on all of the examples in #58233 (comment). I think someone would need to try this before we know how reasonable this would be.

@ianlancetaylor
Copy link
Contributor

I agree that we would have to try it out. I do think that we should get a warning on every piece of code in that comment.

@rsc
Copy link
Contributor

rsc commented Apr 12, 2023

No change in consensus, so declined.
— rsc for the proposal review group

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Status: Declined
Development

No branches or pull requests