Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: compiler can unexpectedly preserve memory #22350

Closed
zardlee1937 opened this issue Oct 20, 2017 · 68 comments
Closed

cmd/compile: compiler can unexpectedly preserve memory #22350

zardlee1937 opened this issue Oct 20, 2017 · 68 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@zardlee1937
Copy link

zardlee1937 commented Oct 20, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.9.1

Does this issue reproduce with the latest release?

yes. every release.

What operating system and processor architecture are you using (go env)?

GOARCH=amd64
GOBIN=C:\Go\bin
GOEXE=.exe
GOHOSTARCH=amd64
GOHOSTOS=windows
GOOS=windows
GOPATH=D:\golang
GORACE=
GOROOT=C:\Go
GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
GCCGO=gccgo
CC=gcc
GOGCCFLAGS=-m64 -fmessage-length=0 -fdebug-prefix-map=C:\Users\zhalei\AppData\Local\Temp\go-build309785515=/tmp/go-build -gno-record-gcc-switches
CXX=g++
CGO_ENABLED=1
PKG_CONFIG=pkg-config
CGO_CFLAGS=-g -O2
CGO_CPPFLAGS=
CGO_CXXFLAGS=-g -O2
CGO_FFLAGS=-g -O2
CGO_LDFLAGS=-g -O2

What did you do?

Write codes for test GC of golang.
Here is what i use to test.

package main

import "fmt"
import "time"

type Node struct {
	next     *Node
	payload  [64]byte
}

func main() {
	root := new(Node)
	curr := root
	i := 0
	
	lastTime := time.Now()
	for {
		currTime := time.Now()
		elapsed := currTime.Sub(lastTime)
		lastTime = currTime
		// 10ms = 10000000ns
		if elapsed > 10000000 {
			fmt.Println("StopTime:", elapsed)
		}
		
		curr.next = new(Node)
		curr = curr.next
		
		i++
		//3000 nodes max
		if i >= 3000 {
			i = 0
			root = curr
		}
	}
}

What did you expect to see?

the program run well.

What did you see instead?

memory never be released until everything stop work. and then, i take my pc power off. -_-

@zardlee1937 zardlee1937 changed the title Seems like something wrong with GC GC can not colect garbage issue Oct 20, 2017
@zardlee1937 zardlee1937 changed the title GC can not colect garbage issue GC cannnot colect garbage issue Oct 20, 2017
@zardlee1937 zardlee1937 changed the title GC cannnot colect garbage issue GC cannnot collect garbage issue Oct 20, 2017
@davecheney
Copy link
Contributor

Here is a smaller reproduction which grows linearly on my machine

package main

type Node struct {
        next    *Node
        payload [64]byte
}

func main() {
        curr := new(Node)
        for {
                curr.next = new(Node)
                curr = curr.next
        }
}

@mattn
Copy link
Member

mattn commented Oct 20, 2017

I think this is not a bug since curr.next refer the pointer even if the variable is updated. This is not weak refernece.

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017

@mattn i'm sorry I don't understand what you mean. I've stared at this code a bunch, and it looks like to me that on each iteration curr is replaced with curr.next, so the previous value of curr is no longer referenced.

  +------+---------+
  + next | payload |
  +------+---------+
       |
       |    +------+---------+
curr ->`--> | next | payload |
            +------+---------+

And so on.

I'm sure i'm missing something, so I'd appreciate someone helping me understand what I'm missing.

@ianlancetaylor ianlancetaylor changed the title GC cannnot collect garbage issue cmd/compile: compiler can unexpectedly preserve memory Oct 20, 2017
@ianlancetaylor
Copy link
Contributor

At least @davecheney 's version is an interesting result of compiler optimization. The very first call to new(Node) never escapes, so the compiler allocates it on the stack. It's next pointer is set to point to the next allocated Node. The Node chain continues to build, but nothing ever clears the the stack allocated pointer, so all the Nodes appear live to the garbage collector.

Basically 1) the fact that you use the same Go variable doesn't mean that the compiler does the same; 2) the fact that you call new doesn't mean that you always get heap memory.

Not sure whether or how to fix this. Does it come up in a real program?

CC @randall77

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 20, 2017
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Oct 20, 2017
@mattn
Copy link
Member

mattn commented Oct 20, 2017

As long as curr.next holds the reference of *Node (i.e. strong reference), it will not be freed, I belieave. So if make it gc, Node should have uintptr instead of *Node.

package main

import (
	"runtime"
	"unsafe"
)

type Node struct {
	next    uintptr
	payload [64]byte
}

func main() {
	curr := new(Node)
	for {
		curr.next = uintptr(unsafe.Pointer(new(Node)))
		curr = (*Node)(unsafe.Pointer(curr.next))
		runtime.GC()
	}
}

Of course this is a dangerous code. Because the value indicated by next may be a pointer already freed.

@davecheney
Copy link
Contributor

@mattn

As long as curr.next holds the reference of *Node (i.e. strong reference), it will not be freed,

But curr is overwritten every loop, so on the next loop, curr.next is actually curr.next.next and so on. The previous value held in curr is now unreferenced so even though it's next field points to the current curr nothing points to it, the previous value of curr; ergo, it's garbage.

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017

@ianlancetaylor thanks for the explanation. I think, irrespective of if this comes up in real code or not, it's still quite a serious bug.

@mattn
Copy link
Member

mattn commented Oct 20, 2017

@davecheney
curr is overwritten but not overwritten the memory block pointed by curr.
pointer is a reference just not a value. if something refer the memory block, GC can sweep for marking.

@mattn
Copy link
Member

mattn commented Oct 20, 2017

@davecheney It GC collect the unreferenced-value immediately as you mentioned, following code can not make linked-list.

package main

type Node struct {
	next     *Node
	payload  [64]byte
}

func NewNode(curr *Node) *Node {
	newnode := new(Node)
	newnode.next = curr
	return newnode
}

func main() {
	curr := NewNode(nil)
	curr = NewNode(curr)
	curr = NewNode(curr)
}

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017 via email

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017 via email

@mattn
Copy link
Member

mattn commented Oct 20, 2017

@davecheney do you mean we can't make linked-list with following code?

        curr := new(Node)
	for i := 0; i < 5; i++ {
                curr.next = new(Node)
                curr = curr.next
        }
        // now curr is top of linked-list

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017 via email

@mattn
Copy link
Member

mattn commented Oct 20, 2017

No. this is not a bug. This works as intended. And changing this behavior is language change.

See https://golang.org/src/container/ring/ring.go#L61

@davecheney
Copy link
Contributor

davecheney commented Oct 20, 2017 via email

@anydream
Copy link

https://golang.org/src/container/ring/ring.go#L69
p.next = &Ring{prev: p}

@mattn Have you ever seen {prev: p}?

@mattn
Copy link
Member

mattn commented Oct 20, 2017

As far as i know, this is usual/general way to make linked list which have size.

@anydream
Copy link

anydream commented Oct 20, 2017

@mattn Your example is doubly linked-list, but the bug of this issue is singly linked-list. They are completely different
Nodes of singly linked-list doesn't reference the previous one.
So your example is unreasonable.

@mattn
Copy link
Member

mattn commented Oct 20, 2017

p = p.next

In your thought, p.next.prev should be garbage-collected?

BTW, how garbages will be corrected? update curr.next.next or curr.next to nil? I don't know such a behavior in other langues. For example, in Java, WeakReference have accessor to get weak-referenced value.

@anydream
Copy link

anydream commented Oct 20, 2017

@mattn
p = p.next
After that operation, the previous Node cannot be referenced.
Allocated memories that cannot be referenced is garbage.

             +------+---------+
*before* p:  + next | payload | <-- This is garbage!
             +------+---------+
                |
                |    +------+---------+
*after*  p:     `--> | next | payload |
                     +------+---------+

@mattn
Copy link
Member

mattn commented Oct 20, 2017

Ah, Sorry, I was confused. :(

This code doesn't have top of the chain. So un-referenced curr should be garbage-collected.

@cznic
Copy link
Contributor

cznic commented Oct 20, 2017

@mattn

As Ian pointed out, the first Node is allocated on stack. You cannot garbage collect stack memory, but that stack-allocated Node references transitively all other Nodes allocated later in heap.

@mattn
Copy link
Member

mattn commented Oct 20, 2017

package main

type Node struct {
	next    *Node
	payload [64]byte
}

func NewNode() *Node {
	return &Node{}
}

func main() {
	curr := NewNode()
	for {
		curr.next = NewNode()
		curr = curr.next
	}
}

This is replaced inline(stack) operation?

@cznic
Copy link
Contributor

cznic commented Oct 20, 2017

I don't know, it's different code. I think there's a compiler flag that tells what's escaping or not, so you can test both versions and look for differences, if any.

@ianlancetaylor
Copy link
Contributor

@cznic This probably won't be particularly helpful, but the commit that introduced stack allocation for non-escaping calls to new, and non-escaping composite literals, is https://golang.org/cl/4954043 .

@ianlancetaylor
Copy link
Contributor

One way to view this class of problems is that a pointer P to a value V is allocated on the stack, but that P dies and nothing else points to V, so V should die. If V were allocated on the heap all would work as programmers intuitively expect, but because V is allocated on the stack it in effect remains live until the function returns.

So the first question is: can we detect that there are no pointers to V? I believe the answer may be yes. On the assumption that pointers to the stack can only be found on the stack, we could build a liveness map of the stack during GC stack scanning, and use that to find all values on the stack that are dead. This would require the stack map to record not only pointer positions, but also object sizes, since liveness is per-object, not per-pointer. I think we could do this by using two bits per word in the stack map, as opposed to the single bit we currently use.

The second question: is there anything we can do? The answer there is definitely yes: when we find that V is dead, we can zero it out.

If this is possible, then the obvious downside is slower stack scanning. I think we will only defeat programmer intuition with objects that appear to be heap allocated, but are stack allocated because they do not escape, that themselves contain pointers into the heap. Perhaps the compiler can mark these cases such that the runtime only has to do this special scanning if they exist.

@hyangah
Copy link
Contributor

hyangah commented Oct 30, 2017

I recently encountered an issue of increased memory usage after minor refactoring of existing code using a closure and @mdempsky pointed me to this issue. So I am adding my example here.

https://play.golang.org/p/zA9iHzDIYi

Basically, the function closure here that contains a reference to the byte slice is stack allocated, so the byte slice remains alive until G returns. GC makes no guarantees about when the memory is garbage collected, thus it's not incorrect behavior. But it's rather surprising and it was hard to diagnose.

@cznic
Copy link
Contributor

cznic commented Oct 30, 2017

GC makes no guarantees about when the memory is garbage collected, thus it's not incorrect behavior.

Then you can just eliminate the GC completely. Can such language be still called "garbage collected"? I don't think so. IMO, a garbage collected language rutnime must guarantee at minimum that OOM will not happen when there's enough of non reachable allocations to reclaim that can satisfy the current allocation request.

edit: typos

@randall77
Copy link
Contributor

randall77 commented Oct 30, 2017

@cznic It's hard to make that specification precisely. Consider:

func f(a []int) {
    runtime.GC()
    if false {
        println(a[0])
    }
}

So can the memory pointed to by a be collected at the GC? If the compiler does dead code elimination, then it can be. Otherwise, it can't.
We only guarantee that something live won't be collected. Of course unreachable objects should be collected, but there's a fuzzy boundary there which is hard (and might tie the hands of future implementations too much) to specify precisely.

@mdempsky
Copy link
Member

mdempsky commented Oct 30, 2017

@randall77 To be fair, in @hyangah's repro, the issue is the goroutine is stuck at select {} without any possible future references to the byte slice, yet the slice is still kept alive.

I agree any definition of "GC correctness" probably shouldn't rely on knowing ahead of time which branch path will be taken, but it seems like if no branch could possibly reference a variable, it should be GC'd.

(That said, regardless of whether we define this as an issue or not, I'm not sure how to effectively address it at the moment.)

@RLH
Copy link
Contributor

RLH commented Oct 30, 2017 via email

@cznic
Copy link
Contributor

cznic commented Oct 30, 2017

@randall77

I'd not say it's fuzzy, IMO. Using/not using DCE makes it into two different programs, from GC's POV, but the cut is still clear: Unreachable memory is memory which cannot be transitively reached from any root observable by the program as a whole, including the runtime.

This issue is about a particular setup where heap allocations are not transitively reachable from any root observable by the program but still not eligible for collection.

@RLH

In general whether an object is reachable from some point in a program is
not decidable.

Did you mean 'statically not decidable'? Anyway, this issue is, I believe, not about what can be said statically about liveness. IIUC, the stack is a GC root, but at the same time the stack contains unreachable (stack allocated) memory transitively pointing to heap. I think that's what's relevant here. If static analysis can improve things, all good, but the proper solution will/should IMO work without any static analysis.

It's just the assumption that stack is in its entirety a GC root that does not hold - and enables this issue to demonstrate itself.

edit: typos

@randall77
Copy link
Contributor

Using/not using DCE makes it into two different programs, from GC's POV

I agree, and that is the problem.

The question is, what should the spec say about my example? Does the spec say that a can be garbage collected, or not? It isn't two different programs according to the spec.

My argument is we can't add anything to the spec that fixes whether a is live or isn't live, because that precludes implementations we have. For example, go1.7 has better DCE than go1.6. If the spec says that a is live, then go1.7 doesn't conform to the spec. If the spec says that a is not live, then go1.6 doesn't conform to the spec.

@ianlancetaylor
Copy link
Contributor

We might consider disabling stack allocation for x := new(...) if there is any reassignment of x elsewhere in the function (or just in a loop?). That seems pretty harsh, though, and is liable to cause more problems than it solves.

Note that this suggestion does not fix the case reported by @hyangah . In her example the function closure being held on the stack is not reassigned anywhere.

@ianlancetaylor
Copy link
Contributor

@cznic

This issue is about a particular setup where heap allocations are not transitively reachable from any root observable by the program but still not eligible for collection.

As you know, the current GC treats the entire stack as a root. So by that definition the heap allocation is transitively reachable from a root.

I assume that what you are trying to get at is that we shouldn't treat the entire stack as a root. That seems similar to what @mdempsky suggested above:

Partial GC scanning of stacks. Instead of recording stack autotmps' liveness in stackmaps, we could leave it up to GC to identify them and trace them out. This would require the compiler to generate additional stackmap-like data structures, and additional GC algorithms for scanning stacks.

@randall77
Copy link
Contributor

I did a little bit of exploring in the stdlib to see how often things like this happen.

I wrote a go/ast program to look through the stdlib for occurrences of

  1. A new(...) immediately assigned to a local variable.
  2. That variable assigned in at least one other location.

There were about 100 of these. I then merged this list with the output of the compiler's -m flag where it reports that the new does not escape. That left me with 6 instances:

src/crypto/elliptic/elliptic.go:253:10
src/crypto/elliptic/elliptic.go:253:10
src/crypto/elliptic/elliptic.go:253:10
src/math/big/int.go:498:4
src/math/big/int.go:501:4
src/math/big/int.go:546:4

The crypto/elliptic case is allocating a few new(big.Int) as a zero values and then overwriting them with a non-zero value some time later. The original empty big.Int never points at anything, so it's a benign case - the autotmp allocation never points to any heap objects.

The math/big case is also allocating with new(big.Int). This case is trickier, as the autotmps end up being modified to hold pointers into the heap. But the references to the autotmps are never dropped, just shuffled around, so the autotmps are alive throughout the function. There isn't anything being retained in the heap that would have been collected if we had disabled stack allocation.

So TL;DR this issue doesn't happen in the stdlib (caveat: I didn't check the test packages).

Hard to know exactly what to conclude from this, but at least we have some evidence that this isssue is fairly rare.

@cznic
Copy link
Contributor

cznic commented Nov 17, 2017

@randall77

I wrote a go/ast program to look through the stdlib for occurrences of

Can you please use your program program to check the corpus? Or, even better, make the code available somewhere?

@randall77
Copy link
Contributor

Here's the program:

package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

func main() {
	for _, pkg := range os.Args[1:] {
		doPkg(pkg)
	}
}

func doPkg(pkg string) {
	fset := token.NewFileSet()
	pkgs, err := parser.ParseDir(fset, pkg, func(os.FileInfo) bool { return true }, 0)
	if err != nil {
		panic(err)
	}
	for _, p := range pkgs {
		for _, f := range p.Files {
			for _, d := range f.Decls {
				if fn, ok := d.(*ast.FuncDecl); ok {
					checkFn(fn, fset)
				}
			}
		}
	}
}

func checkFn(fn *ast.FuncDecl, fset *token.FileSet) {
	// Count number of assignments to each variable.
	assignments := map[*ast.Object]int{}
	ast.Inspect(fn, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.AssignStmt:
			for _, y := range x.Lhs {
				if id, ok := y.(*ast.Ident); ok {
					assignments[id.Obj]++
				}
			}
		}
		return true
	})

	ast.Inspect(fn, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.AssignStmt:
			for i, r := range x.Rhs {
				c, ok := r.(*ast.CallExpr)
				if !ok {
					continue
				}
				name, ok := c.Fun.(*ast.Ident)
				if !ok {
					continue
				}
				if name.Name != "new" {
					continue
				}
				l := x.Lhs[i]
				if id, ok := l.(*ast.Ident); ok && assignments[id.Obj] > 1 {
					fmt.Printf("%s %s XXX\n", fset.Position(x.TokPos).String()[28:], id)
				}
			}
		}
		return true
	})
}

(Replace 28 with the right constant for your current working directory.)

Get the list of std packages with

go list ./... | cut -d_ -f2- > pkgs

Run the program above with

cat pkgs | xargs ./program > log

Find all the non-escaping allocations in the std lib

GO_GCFLAGS=-m ./make.bash 2>&1 | cat > log2

Merge the results and find matching positions:

cat log log2 | sort | grep -C1 XXX

From the output you can eyeball all of the places where the assignments match a "does not escape" message.

@chengzhicn
Copy link

From what i read, I think this issue related to #19469, and that issue cost me hours to debug. It's really hard for newcomer like me.
Before the workaround, each task cost approx. 200MB during it's lifetime(several hours), task itself is not memory intensive nor cpu intensive, it's just occasional need to connect some server and exchange some data.
There are approx. a thousand tasks need to run concurrently which need a huge amount of memory before workaround applied.
This issue also impose a peak memory requirement which must be satisfied or program may be killed by OOM killer since computer only have finite memory, but that memory requirement is hard to calculate and easily overlooked.
I have to remind myself to split function that may use a lot of memory and take a lot of time so my program would have an expected memory requirement and won't be killed by OOM killer, but the process is burdensome and error prone.

@kklobe
Copy link

kklobe commented Apr 8, 2018

naive question: given @randall77's code, is this being considered for go vet?

@randall77
Copy link
Contributor

@kklobe, probably not. Vet requires a very low false positive rate, as it is run by default. vet errors in the stdlib are a no-go. (Assuming the false positives my code generates can't be fixed.)

Possibly something for go/lint, although that's more for style than substance. But in any case I'm not sure anyone knows what the right algorithm might be. Mine was kind of a hack.

@docmerlin
Copy link

I ran into this problem in prod/ It resulted in OOM kill and some nasty refactoring and hackery to make it behave properly. In our case we were able to work around it somewhat, but it is still pretty annoying.

@dr2chase
Copy link
Contributor

Did your bug fit the pattern

  ptr := &StackallocatedInitialValue
  ...
  ptr.field = bigHeapThing
  ...
  ptr = someOtherValue

or was it something more obfuscated than that?

@docmerlin
Copy link

It was quite a bit more obfuscated than that, but fit the general pattern.

@gopherbot
Copy link

Change https://golang.org/cl/134155 mentions this issue: cmd/compile,runtime: implement stack objects

@gopherbot
Copy link

Change https://golang.org/cl/134156 mentions this issue: cmd/compile,runtime: remove ambiguously live logic

gopherbot pushed a commit that referenced this issue Oct 3, 2018
Rework how the compiler+runtime handles stack-allocated variables
whose address is taken.

Direct references to such variables work as before. References through
pointers, however, use a new mechanism. The new mechanism is more
precise than the old "ambiguously live" mechanism. It computes liveness
at runtime based on the actual references among objects on the stack.

Each function records all of its address-taken objects in a FUNCDATA.
These are called "stack objects". The runtime then uses that
information while scanning a stack to find all of the stack objects on
a stack. It then does a mark phase on the stack objects, using all the
pointers found on the stack (and ancillary structures, like defer
records) as the root set. Only stack objects which are found to be
live during this mark phase will be scanned and thus retain any heap
objects they point to.

A subsequent CL will remove all the "ambiguously live" logic from
the compiler, so that the stack object tracing will be required.
For this CL, the stack tracing is all redundant with the current
ambiguously live logic.

Update #22350

Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f
Reviewed-on: https://go-review.googlesource.com/c/134155
Reviewed-by: Austin Clements <austin@google.com>
@golang golang locked and limited conversation to collaborators Oct 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests