Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify when calling recover stops a panic #34530

Open
zigo101 opened this issue Sep 25, 2019 · 26 comments
Open

spec: clarify when calling recover stops a panic #34530

zigo101 opened this issue Sep 25, 2019 · 26 comments
Assignees
Labels
Documentation Issues describing a change to documentation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@zigo101
Copy link

zigo101 commented Sep 25, 2019

What version of Go are you using (go version)?

go version go1.13 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

package main

import "fmt"

func demo() {
	defer func() {
		defer func() {
			// recover panic 2
			fmt.Println("panic", recover(), "is recovered")
		}()

		// recover panic 1
		defer fmt.Println(" (done).")
		defer recover()
		defer fmt.Print("To recover panic 1 ...")

		defer fmt.Println("now, two active panics coexist")
		panic(2) // If this line is commented out, then the program will not crash.
	}()
	panic(1)
}

func main() {
	demo()
}

What did you expect to see?

Not crash.

What did you see instead?

Crashes.

@zigo101
Copy link
Author

zigo101 commented Sep 25, 2019

If this is the intended behavior, then the descriptions in the spec doesn't cover this case.

@randall77
Copy link
Contributor

@danscales

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 26, 2019
@bcmills
Copy link
Contributor

bcmills commented Sep 26, 2019

Tentatively milestoning for 1.14 on the theory that it may be addressed alongside #34481, but feel free to re-milestone if that is not realistic.

@bcmills bcmills added this to the Go1.14 milestone Sep 26, 2019
@bcmills
Copy link
Contributor

bcmills commented Sep 26, 2019

@go101, personally I would expect this program to crash with panic 1.

Since both of the recover calls are within nested defer statements, I would expect them to only cover panics that originate within the call to the deferred function.

But certainly a clarification either way would be helpful.

@zigo101
Copy link
Author

zigo101 commented Sep 26, 2019

By the explanations for recover mechanism in spec, the following program has no fundamental differences from the above one. But the following one exits without crashing.

package main

import "fmt"

func demo() {
	defer func() {
		// recover panic 1
		defer fmt.Println(" (done).")
		defer recover()
		defer fmt.Print("To recover panic 1 ...")
		
		defer func() {
			// recover panic 2
			fmt.Println("panic", recover(), "is recovered")
		}()

		defer fmt.Println("now, two active panics coexist")
		panic(2)
	}()
	panic(1)
}

func main() {
	demo()
}

My understanding by reading spec is a recover call will recover the panic in the caller of the caller, which must be deferred, of the recover call. Whether or not the recover call itself is deferred is not important.

@danscales
Copy link
Contributor

Yes, the Go language spec is very vague and imprecise on panic/recover and I am hoping to fix it:

https://go-review.googlesource.com/c/go/+/189377

but there's not quite consensus yet.

I don't think there is any disagreement on the behavior of your example (unlike the abort behavior mentioned in #29226 ). A recover only recovers a panic if it is called directly in a defer function that is directly invoked as part of the panicking process of that panic. A recover does not apply and returns nil if it is not called directly in a defer function or if it is called from a defer that was not directly invoked by the panic sequence of the panic (i.e. is nested inside some other defer or function called by the defer).

Your second example seems to be a special little bug/quirk of doing 'defer recover()'. The detection mechanism in the implementation considers that recover is called directly in the containing function, so that recover does apply. Note that recover does not happen if you do

  defer func() {
    recover()
  }

I think the first thing in both these bugs are to get agreement on the spec - both specifying the current behavior that people depend on and/or would generally agree on, and also the behavior (like the abort behavior in #29226 ) that has been there for a while, but has never been specified and people might like to change.

@zigo101
Copy link
Author

zigo101 commented Sep 26, 2019

Yes, this is a little quirk. There is not a way to verify whether or not the recovered value is really 1.
Whatever, I think the behaviors of the above two programs should be always consistent, for there are no fundamental differences between them.

@ianlancetaylor
Copy link
Member

The difference between the two programs has to do with the behavior of defer recover(). The question is simply whether recover is being called by a deferred function. When you write defer recover() in a normal function, it is not (a deferred call to recover is not a case in which recover is called by a deferred function). But when defer recover() occurs within a function that is itself deferred, then it is.

So, in the first program, the defer recover() sees the panic(2), but since it is not called from a deferred function, it does not stop the panic. That panic continues until the "recover panic 2" deferred function, which recovers it. And nothing recovers the panic(1), so the program crashes.

In the second program, the panic(2) is recovered. Then we run the defer recover() in a deferred function. In this case recover is called from a deferred function--the outer deferred function, not the defer recover() function--so that catches panic(1). And the program succeeds.

@ianlancetaylor ianlancetaylor added the Documentation Issues describing a change to documentation. label Sep 27, 2019
@ianlancetaylor ianlancetaylor changed the title runtime: sometimes recover calls fail to work spec: clarify when calling recover stops a panic Sep 27, 2019
@zigo101
Copy link
Author

zigo101 commented Sep 27, 2019

@ianlancetaylor
Thanks for the clarification.

Your explanation makes me more believe that each goroutine maintains a panic stack. Only the top panic in the stack may be recovered. And to be recoverable, the top panic must not be created in runtime.Goexit() calls.

So my current understanding is like this:

Assume the frame depth of the entry call of a goroutine is 1, 
in the execution of the goroutine, when a function call ends 
and the frame depth becomes smaller than the size of the panic stack, 
then the top panic will collapse down (to replace the below panic) 
until the size of stack is the same as the frame depth. 

Each goroutine maintains a panic stack. A panic must be the top one in the
stack and must not be created in a runtime.Goexit call to be recoverable.
A recovered panic will be removed from the stack.

Is the understanding right?

[edit]: there may be some blank slots in the panic stack, or each panic will be associated with a depth property. The depth of a panic can decrease and will never be larger than the current execution frame depth.

@zigo101
Copy link
Author

zigo101 commented Sep 27, 2019

@ianlancetaylor
By your above explanation, I think the current new panic explanations in https://go-review.googlesource.com/c/go/+/189377 still needs a little adjustment.

@ianlancetaylor
Copy link
Member

Sorry, I don't really understand your frame depth discussion.

I encourage you to comment on the CL where you think it needs adjustment.

@mdempsky
Copy link
Contributor

I think this is a really interesting test case. Thanks for reporting it, @go101.

On my phone, so I haven't looked at the Go spec wording or @danscales's suggestion recently, so I'm not sure the right behaviour. From memory though, I can see arguments either way.

@danscales
Copy link
Contributor

Thanks for all the examples and discussion. I missed that there was a defer recover() even in the first example. @ianlancetaylor gives a very good explanation of the current behavior of defer recover(). Overall, the meaning of defer recover() isn't intuitively obvious and its actual behavior with the current implementation is confusing. So, we are thinking about banning the use of defer recover() (since it seems likely to be only useful for test code). Instead of defer recover(), you should always at least use defer func() { recover() }() (and of course, it is always recommended to look at the return code of recover()).

With respect to talking about (and spec'ing out) the behavior of panic/recover/recursive panics, it will be best not to use defer recover(), since that will just create more cases that tend to confuse the other issues.

@mdempsky
Copy link
Contributor

mdempsky commented Oct 2, 2019

So, we are thinking about banning the use of defer recover() (since it seems likely to be only useful for test code).

Are we okay breaking Go 1 compat here? There's at least some code that uses defer recover(); e.g., https://github.com/billziss-gh/cgofuse/blob/master/fuse/host.go#L406

@bcmills
Copy link
Contributor

bcmills commented Oct 2, 2019

Are we okay breaking Go 1 compat here?

There are two changes we can make without breaking compatibility:

  1. We can make defer recover() a vet error everywhere, which would catch most uses during testing but not break uses in transitive dependencies.
  2. We can make defer recover() a compile-time error in any module whose go.mod file specifies go 1.14 or higher, since ~no such module exists yet.

@ianlancetaylor
Copy link
Member

That use of defer recover() in fuse is a bug, right? It is in effect a no-op.

@mdempsky
Copy link
Contributor

mdempsky commented Oct 2, 2019

That use of defer recover() in fuse is a bug, right? It is in effect a no-op.

I believe so.

@zigo101
Copy link
Author

zigo101 commented Oct 3, 2019

Personally, I think the cost of either vet or ban defer recover() is higher than making a little more explanations in spec, not only for compatibility problems, but also for sometimes defer recover() is desired (just like the second example above, it is a have-to. [edit]: there are really some workarounds, such as splitting the outer defer function into two. So the main reason is still to keep compatibility.).

[edit 2]: the validity of the defer recover() in the first example depends on whether or not panic 2 will happen. So vet will make some false positive reports (though this is not a problem for vet).

@zigo101
Copy link
Author

zigo101 commented Oct 3, 2019

On the other hand, for consistency, it is not a bad idea to ban discarding returns of built-in functions. Currently, only the returns of copy and recover can be discarded.

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
MagicalTux added a commit to KarpelesLab/contexter that referenced this issue Jan 18, 2022
@zigo101
Copy link
Author

zigo101 commented Oct 21, 2024

5 years have passed now. Should the spec be made more precise?
Is it good to add a line in the spec: Only the newest unrecovered recoverable panic can be recovered. every recover call is viewed as an attempt to recover the newest unrecovered panic in the current goroutine.

@ianlancetaylor
Copy link
Member

CC @griesemer

@griesemer griesemer self-assigned this Oct 21, 2024
@griesemer griesemer modified the milestones: Backlog, Go1.24 Oct 21, 2024
@griesemer
Copy link
Contributor

Marked for 1.24. Low priority.

@griesemer
Copy link
Contributor

griesemer commented Dec 19, 2024

[edited: some of the observations/explanations here are incorrect - see comment instead]

Some observations:

For the purpose of recovering a panic, whether recover() is called as part of a defer or not doesn't matter: as long as recover() is called directly by the deferred function, it will recover a panic. Per the spec:

The return value of recover is nil when the goroutine is not panicking or recover was not called directly by a deferred function. Conversely, if a goroutine is panicking and recover was called directly by a deferred function, the return value of recover is guaranteed not to be nil.

... and in that case that panic is recovered and the panicking stops.

Note that in defer recover(), recover() is called directly the same way it is called directly in an assignment r := recover(). The point of directly is that the recover() call itself is not within another enclosing function.

Consider the following example of 4 functions a, b, c, d (playground).
Each of these functions is called via protect which simply reports any panics that propagated out of these functions.
Furthermore, a helper function trace is used to document (print) what recover returns.

package main

import "fmt"

func main() {
	protect("protect of a", a)
	protect("protect of b", b)
	protect("protect of c", c)
	protect("protect of d", d)
}

func a() {
	defer func() {
		r := recover() // recovers panic 1
		trace("recover in deferred function of a", r)
	}()
	panic(1)
}

func b() {
	defer func() {
		defer recover() // recovers panic 2
	}()
	panic(2)
}

func c() {
	defer func() {
		r := recover() // recovers panic 3
		trace("recover in deferred function of c", r)
		defer recover() // panic 3 is already recovered - this doesn't do anything!
		panic(4)        // there's no recovery for this panic in c, it is propagated up
	}()
	panic(3)
}

func d() {
	defer func() {
		r1 := recover() // recovers panic 4
		trace("recover in deferred function of d", r1)
		defer func() {
			r2 := recover() // recovers panic 5
			trace("recover in deferred function of deferred function of d", r2)
		}()
		panic(5)
	}()
	panic(4)
}

func trace[P any](msg string, x P) P {
	fmt.Println(msg+":", x)
	return x
}

func protect(msg string, f func()) {
	defer func() {
		trace(msg, recover())
	}()
	f()
}

Running this code produces:

recover in deferred function of a: 1
protect of a: <nil>
protect of b: <nil>
recover in deferred function of c: 3
protect of c: 4
recover in deferred function of d: 4
recover in deferred function of deferred function of d: 5
protect of d: <nil>
  • For a, we have the standard scenario: a panic is recovered via the deferred function.
  • For b, we have the same scenario, but this time the recover() call itself is deferred. Since no panic propagates out of b (protect of b reports nil), the defer recover() successfully recovered panic 2. One could have written defer trace("deferred recover", recover()) for the same effect: recover() is still called directly, even though it is now an argument to trace and deferred.
  • For c, the first recover recovers panic 3, but the 2nd recover - even though deferred - doesn't do anything - and thus panic 4 must be recovered via a deferred function that does the recovering. As a result, panic 4 is propagated out and caught by the protect call.
  • For d, we have the same as in c but now panic 5 is recovered properly.

In short, I believe this is working as intended and the spec is actually already covering this case.
When it comes to clarifications, we may want to be a bit more precisely explain what "calling directly" means, to avoid confusions around defer recover().

@zigo101
Copy link
Author

zigo101 commented Dec 19, 2024

It looks, none of the 4 cases cover the case in the first comment?

@griesemer
Copy link
Contributor

@zigo101 Hm, you are correct, I missed explaining your case, which is essentially like this (playground):

func main() {
	defer func() {
		defer func() {
			recover() // recovers panic 2
		}()

		defer recover() // <<< deferred recover does not recover panic 1
		panic(2)
	}()
	panic(1)
}

Panic 2 is raised before the deferred recover is executed and we now have two panics. The deferred recover cannot recover panic 2 (by definition of recover) and it doesn't "see" panic 1 anymore.

I agree that this should probably be explained in the spec.

@zigo101
Copy link
Author

zigo101 commented Feb 19, 2025

My old comment in #34530 (comment) was not correct. It is modified to: every recover call is viewed as an attempt to recover the newest unrecovered panic in the current goroutine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues describing a change to documentation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

8 participants