-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: rewrite escape analysis #23109
Comments
If I understand correctly, the example assumes there is a type |
The reflooding is added in CL https://go-review.googlesource.com/c/go/+/30693. Its CL description explains it. Probably we should add this as a comment. |
Change https://golang.org/cl/121001 mentions this issue: |
No code changes, only revised comments in an attempt to make escape analysis slightly less confusing. Updates #23109. Change-Id: I5ee6cea0946ced63f6210ac4484a088bcdd862fb Reviewed-on: https://go-review.googlesource.com/121001 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
Over the weekend, I reimplemented a basic variant of esc.go's escape analysis logic, and now I better understand how it works in practice and what the documentation is explaining. One thing I still don't understand though is the handling of recursive functions. In particular, this bit:
It looks like in esccall that we handle recursive calls, including wiring their return values into the flow graph correctly (look for "function in same mutually recursive group. Incorporate into flow graph"). I tried disabling this code by adding This code comes from CL 6741044 (507fcf3). It's true that before that CL we lost track of the return values in mutually recursive function calls, but the CL appears to have added both. So maybe the "if e.recursive" logic was written first, then lvd@ decided to fix the return value tracking, but forgot to remove the original (now unnecessary) fix? |
When I was working on this and thought I understood it, the recursive punt seemed necessary. I also don't know that this is that worthwhile; I took a stab at handling recursion and even had a CL for it that I somewhat trusted, and it avoided practically no heap allocations, so I didn't pursue it. The problem is that if anything escapes, the this-field-versus-that-field precision is lacking, and so everything escapes. I am however very interested in any writeup you can put together, because the algorithm is a real pain to understand. |
So here's the basic idea I understand: Escape analysis treats the control flow graph as a bunch of locations, and assignments between them. Individual assignments can involve addressing or an arbitrary number of dereferences. For example:
Note that Assignments can also be to the "sink" (basically, the Go heap). All Go language constructs can be lowered into this assignment graph. For example,
can be represented as
Assignments to global variables, through pointers, passing arguments to unknown functions, etc. are recorded as assignments to the sink. Also, if q has a greater loop depth than p, then This is an imprecise but conservative representation of data flow. As an optimization, the Go source might lower into multiple edges between the same two locations; it's only necessary to track the shortest edge. Once we have a graph representing one or more functions, we can walk the graph to compute each node's "distance" from the sink and from any return parameters. This is basically just a bunch of shortest-path computations, except that distance is bounded as non-negative. For example, if p has distance 0 from some location, and there's an edge Finally, we can observe two things from the resulting distance computations:
-- Notably, esc.go doesn't exactly follow this approach. Instead of simply constructing a graph and walking it as I describe above, escassign/escflow construct a partial graph (synthesizing OADDR/ODEREF nodes in lieu of tracking edge lengths directly) and escflood/escwalk walk the remaining edges. The subsequent computations based on the distance calculations are also intermingled into the walking logic, and tracking/recording distances is much more complicated. |
That's a nicer description than mine, I think. I am still trying to figure out an intuitive explanation of "suffixValue". |
As best I can tell, the suffixValue logic is captured in my simplified description by having distances bottom out at 0. Basically, it's to ensure that given
then p needs to be heap allocated, but q doesn't. |
As an update, I have a WIP branch where I've written a new escape analysis pass. You can see the new code here: https://github.com/mdempsky/go/blob/esc5/src/cmd/compile/internal/gc/esc2.go It's still a little rough and I need to add documentation and better debugging facilities. (The latter being mostly ad hocly developed while I bring it up to feature parity with esc.go.) However, I believe it's about on par with esc.go in terms of correctness, and it's less than half the number of lines of code (1117 vs 2425). It roughly follows the description I made above: the stmts/stmt/value/assign/call methods all trudge through the AST tracking pointer flows and constructing a graph of EscLocations, and then flood+walk do a simple walk over the graph identifying locations that escape and value flows from pointers to result parameters. The graphs it builds also tend to be much smaller than esc.go's. For example:
esc2.go only needed about 40% as many graph nodes as esc.go, and about 80% as many edges. There's room for further improvement here too. My next immediate goal is continuing to dig through the differences in behavior between esc.go and esc2.go, and then polishing it up for review for the next dev cycle. |
I'm repurposing this issue for tracking landing my escape analysis rewrite. See also my updates on golang-dev@ (which apparently are intermingled into the general Go 1.13 planning thread, despite rsc's attempt to avoid that): https://groups.google.com/d/msg/golang-dev/jln8MwFpATc/k78gXa3bDwAJ |
Change https://golang.org/cl/167715 mentions this issue: |
This logic is used by the current escape analysis pass, but otherwise logically independent. Move (unchanged) into a separate file to make that clearer, and to make it easier to replace esc.go later. Updates #23109. Change-Id: Iec8c0c47ea04c0008165791731c11d9104d5a474 Reviewed-on: https://go-review.googlesource.com/c/go/+/167715 Reviewed-by: Robert Griesemer <gri@golang.org>
Change https://golang.org/cl/170319 mentions this issue: |
For most nodes (e.g., OPTRLIT, OMAKESLICE, OCONVIFACE), escape analysis prints "escapes to heap" or "does not escape" to indicate whether that node's allocation can be heap or stack allocated. These messages are also emitted for OADDR, even though OADDR does not actually allocate anything itself. Moreover, it's redundant because escape analysis already prints "moved to heap" diagnostics when an OADDR node like "&x" causes x to require heap allocation. Because OADDR nodes don't allocate memory, my escape analysis rewrite doesn't naturally emit the "escapes to heap" / "does not escape" diagnostics for them. It's also non-trivial to replicate the exact semantics esc.go uses for OADDR. Since there are so many of these messages, I'm disabling them in this CL by themselves. I modified esc.go to suppress the Warnl calls without any other behavior changes, and then used a shell script to automatically remove any ERROR messages mentioned by run.go in "missing error" or "no match for" lines. Fixes golang#16300. Updates golang#23109. Change-Id: I3993e2743c3ff83ccd0893f4e73b366ff8871a57
For most nodes (e.g., OPTRLIT, OMAKESLICE, OCONVIFACE), escape analysis prints "escapes to heap" or "does not escape" to indicate whether that node's allocation can be heap or stack allocated. These messages are also emitted for OADDR, even though OADDR does not actually allocate anything itself. Moreover, it's redundant because escape analysis already prints "moved to heap" diagnostics when an OADDR node like "&x" causes x to require heap allocation. Because OADDR nodes don't allocate memory, my escape analysis rewrite doesn't naturally emit the "escapes to heap" / "does not escape" diagnostics for them. It's also non-trivial to replicate the exact semantics esc.go uses for OADDR. Since there are so many of these messages, I'm disabling them in this CL by themselves. I modified esc.go to suppress the Warnl calls without any other behavior changes, and then used a shell script to automatically remove any ERROR messages mentioned by run.go in "missing error" or "no match for" lines. Fixes #16300. Updates #23109. Change-Id: I3993e2743c3ff83ccd0893f4e73b366ff8871a57 Reviewed-on: https://go-review.googlesource.com/c/go/+/170319 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com>
Change https://golang.org/cl/170321 mentions this issue: |
Change https://golang.org/cl/170322 mentions this issue: |
"leaking closure reference" is redundant for similar reasons as "&x escapes to heap" for OADDR nodes: the reference itself does not allocate, and we already report when the referenced variable is moved to heap. "mark escaped content" is redundant with "leaking param content". Updates #23109. Change-Id: I1ab599cb1e8434f1918dd80596a70cba7dc8a0cf Reviewed-on: https://go-review.googlesource.com/c/go/+/170321 Run-TryBot: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I've uploaded CL 170322, which is in good enough shape for people to try out, if they're interested. Here are some compilebench numbers:
And here's a list of expressions that esc.go heap allocated, but that escape.go stack allocates: Escape analysis improvements (350 lines)
There are no instances in the main repo where a variable used to be stack allocated, but now heap allocates. (In particular, the regressions that were discussed on golang-dev@ have all been fixed.) I need to figure out why the js/wasm bot is failing, and polish the documentation further. Then I'll prepare for review. |
Change https://golang.org/cl/170323 mentions this issue: |
Package unsafe's safety rules require that pointers converted to uintptr must be converted back to pointer-type before being stored into memory. In particular, storing a pointer into a non-pointer-typed expression does not guarantee the pointer stays valid, even if the expression refers to a pointer-typed variable. wasm's StorepNoWB implementation violates these rules by storing a pointer through a uintptr-typed expression. This happens to work today because esc.go is lenient in its implementation of package unsafe's rules, but my escape analysis rewrite follows them more rigorously, which causes val to be treated as a non-leaking parameter. This CL fixes the issue by using a *T-typed expression, where T is marked //go:notinheap so that the compiler still omits the write barrier as appropriate. Updates #23109. Change-Id: I49bc5474dbaa95729e5c93201493afe692591bc8 Reviewed-on: https://go-review.googlesource.com/c/go/+/170323 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
Change https://golang.org/cl/170447 mentions this issue: |
Change https://golang.org/cl/170448 mentions this issue: |
Change https://golang.org/cl/170450 mentions this issue: |
Updates #23109. Change-Id: I55f7860c868acc948a6397ab6a9295e177724a56 Reviewed-on: https://go-review.googlesource.com/c/go/+/170450 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>
This CL adds a new escape analysis implementation, which can be enabled through the -newescape compiler flag. This implementation focuses on simplicity, but in the process ends up using less memory, speeding up some compile-times, fixing memory corruption issues, and overall significantly improving escape analysis results. Updates #23109. Change-Id: I6176d9a7ae9d80adb0208d4112b8a1e1f4c9143a Reviewed-on: https://go-review.googlesource.com/c/go/+/170322 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
The new escape analysis implementation tries to emit debugging diagnostics that are compatible with the existing implementation, but there's a handful of cases that are easier to handle by updating the test expectations instead. For regress tests that need updating, the original file is copied to oldescapeXXX.go.go with -newescape=false added to the //errorcheck line, while the file is updated in place with -newescape=true and new test requirements. Notable test changes: 1) escape_because.go looks for a lot of detailed internal debugging messages that are fairly particular to how esc.go works and that I haven't attempted to port over to escape.go yet. 2) There are a lot of "leaking param: x to result ~r1 level=-1" messages for code like func(p *int) *T { return &T{p} } that were simply wrong. Here &T must be heap allocated unconditionally (because it's being returned); and since p is stored into it, p escapes unconditionally too. esc.go incorrectly reports that p escapes conditionally only if the returned pointer escaped. 3) esc.go used to print each "leaking param" analysis result as it discovered them, which could lead to redundant messages (e.g., that a param leaks at level=0 and level=1). escape.go instead prints everything at the end, once it knows the shortest path to each sink. 4) esc.go didn't precisely model direct-interface types, resulting in some values unnecessarily escaping to the heap when stored into non-escaping interface values. 5) For functions written in assembly, esc.go only printed "does not escape" messages, whereas escape.go prints "does not escape" or "leaking param" as appropriate, consistent with the behavior for functions written in Go. 6) 12 tests included "BAD" annotations identifying cases where esc.go was unnecessarily heap allocating something. These are all fixed by escape.go. Updates #23109. Change-Id: Iabc9eb14c94c9cadde3b183478d1fd54f013502f Reviewed-on: https://go-review.googlesource.com/c/go/+/170447 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
Despite repeated efforts, I'm still confused by our escape analysis code. Also, based on discussion with other compiler devs, apparently this difficulty understanding is not unique to me. It would be good to better document this code so we can hopefully simplify and improve it.
Some specific questions I've had trying to understand the code:
The "Escape analysis" documentation has a very high-level explanation, but leaves me with more questions than answers:
It mixes discussion of "nodes" and "variables," which makes it unclear to me whether expressions other than ONAME Nodes are present in the flow graph. (I think they are, otherwise I can't make sense of how we know if
new(T)
can be stack allocated or not.) It also talks about "values" having addresses, when only variables do. Using terminology consistent with the Go spec would help, IMO.It mentions storing edges between "pointer-containing nodes". It would be nice to understand how this is expected to work in the presence of
unsafe.Pointer
<->uintptr
conversions.Pseudo-code like
flow(closure, &var)
is confusing too: what's the&
here supposed to signify?I can't make sense of the phrase "tags all variables of it can reach an & node as escaping".
The Level godocs are super opaque to me:
I kind of understand the meaning of
value
, but I don't completely understand why adding the +1 and -1 values is sound.I have no idea what a "maximum-copy-started-suffix-level" is, and the
x.left.left
,&Node{x}
, etc. examples don't really help. I'm particularly confused because aNode
can represent so many different operations depending on theOp
, and I'm lost on what semantic meaning we're divining from syntactic tree depth.escAnalyze has a reflooding loop that keeps looping until fixed point. It's not obvious to me why this is necessary. (At least experimentally, I do see that "Reflooding foo" shows up when building the standard library with -m=3.)
/cc @dr2chase @cherrymui @paranoiacblack
The text was updated successfully, but these errors were encountered: