
proposal: runtime: add a mechanism for specifying a minimum target heap size #23044

Closed
cespare opened this issue Dec 8, 2017 · 35 comments

@cespare
Contributor

cespare commented Dec 8, 2017

I propose that we add a GC knob (either an environment variable or a function in runtime/debug) which allows users to set the minimum target heap size for the garbage collector.

For now I'll call this setting GOGCMIN pending a better name.

Right now, the one GC tunable available to users is GOGC. From the runtime documentation:

The GOGC variable sets the initial garbage collection target percentage. A collection is triggered when the ratio of freshly allocated data to live data remaining after the previous collection reaches this percentage. The default is GOGC=100.

The idea is that when the target heap size is calculated based on live data and GOGC, if that target is less than GOGCMIN, then GOGCMIN is used instead.
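
In other words, a minimal sketch of the pacing rule the proposal implies, assuming GOGCMIN is expressed in bytes (the names here are illustrative, not a concrete API):

// goalWithMin is the heap-size target the proposal would have the GC use:
// the usual GOGC-based goal, but never less than the configured minimum.
func goalWithMin(liveBytes, gogc, gogcMinBytes uint64) uint64 {
	goal := liveBytes + liveBytes*gogc/100 // today's rule: live * (1 + GOGC/100)
	if goal < gogcMinBytes {
		goal = gogcMinBytes
	}
	return goal
}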

The problem

It has often been noted that programs which make a lot of allocations while maintaining a small live heap end up doing excessive garbage collections.

At my company, we've noticed this a number of times. It's typically a problem with data processing applications which might read and write messages from queues at a large rate, yet keep very little data live over the long term.

We've had to address CPU usage due to such excessive garbage collections for at least three separate applications. Here are two real situations we observed:

  • 29 GB of system memory; 40 MB heap; 500 MB/sec allocation; 30 collections/sec
  • 480 GB of system memory; 100 MB heap; 700 MB/sec allocation; 12 collections/sec

In all cases, we would prefer that the application use a lot more memory in order to do fewer collections.

Existing workarounds

GOGC

The knob that's available for controlling this situation is GOGC, described above. When we've come across these issues in the past, we've set high GOGC values and that largely fixes the problem, at least in the short term.

Unfortunately, this is a fragile fix. We don't actually care about the GOGC ratio; we want to target a particular heap size. So if we have a 40 MB heap, we might back into a GOGC value like 1000 or even 10000 in order to target a 400 MB or 4 GB heap size, respectively. With large GOGC values, the application is extremely sensitive to small increases in the live heap size: if our 40 MB heap increases to 100 MB (not a large jump), then our 4 GB target becomes 10 GB.
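
The arithmetic behind "backing into" a GOGC value looks like this (a sketch, using the documented relationship target = live * (1 + GOGC/100)):

// gogcFor returns the GOGC value that paces a heap with liveBytes of live
// data toward a heap goal of targetBytes.
func gogcFor(targetBytes, liveBytes uint64) uint64 {
	return 100 * (targetBytes - liveBytes) / liveBytes
}

// gogcFor(4<<30, 40<<20) == 10140, roughly the GOGC=10000 figure above; if
// the live heap then grows to 100 MB, that same setting targets ~10 GB.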

In fact, we recently had crashes with one application where we set GOGC=1200 many months ago when its typical heap size was a few hundred MB. The live data size increased to several GB and then the service started OOMing.

SetGCPercent

One way to address the shortcomings of GOGC is to dynamically adjust its value. This is possible by using runtime/debug.SetGCPercent.

We tried a solution that involved a long-running goroutine in every application watching memory use (via runtime.ReadMemStats) and adjusting SetGCPercent.
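
A simplified sketch of that kind of watcher (illustrative names and numbers only; HeapAlloc stands in for the live-heap figure that MemStats does not expose):

package gcwatch

import (
	"runtime"
	"runtime/debug"
	"time"
)

// Watch periodically re-derives GOGC so that the next collection is paced
// toward targetBytes.
func Watch(targetBytes uint64, interval time.Duration) {
	for range time.Tick(interval) {
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms) // stops the world briefly, so keep interval coarse
		live := ms.HeapAlloc      // approximation; not the post-GC live heap
		if live == 0 {
			continue
		}
		pct := 100
		if live < targetBytes {
			// GOGC value whose goal (live * (1 + GOGC/100)) lands on targetBytes.
			if p := int(100 * (targetBytes - live) / live); p > pct {
				pct = p
			}
		}
		debug.SetGCPercent(pct)
	}
}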

There are at least two problems with this approach:

  • We can't react to arbitrarily fast increases in heap size. If the heap is very small and we set SetGCPercent very high, we are prone to OOMing if the heap grows suddenly before we adjust SetGCPercent again. (And we don't want to make the adjustments too frequently because it's not cheap.)
  • As far as I can tell, the live heap size from the end of the previous collection is not exposed by runtime.MemStats. (I think it's only available by parsing gctrace output.) This is the number we really need in order to accurately pick a GOGC value.

Heap ballast

We eventually settled on an awful workaround: we have a long-running goroutine manage a set of dummy allocations (ballast). When the heap is small, the ballast is large; if the live heap reaches the target size, the ballast shrinks to zero.

We can't pick the ballast as accurately as we would like because, again, we need the live heap size from the previous GC cycle. But by using the total non-ballast heap size as a conservative proxy, the solution works well enough. In particular, by keeping GOGC at a normal level (usually 100), we aren't subject to the heap size spike issue.

Obviously this isn't a great solution for the long term since it wastes memory that could otherwise be used for something else (like disk cache).
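
For concreteness, a stripped-down sketch of such a ballast manager (illustrative only: the resize threshold and polling interval are arbitrary, and HeapAlloc minus the ballast approximates the "total non-ballast heap size" mentioned above):

package ballastmgr

import (
	"runtime"
	"time"
)

// Run keeps enough dummy allocation alive that the heap is paced as if it
// were at least targetBytes; the ballast shrinks away as the real heap
// approaches the target.
func Run(targetBytes uint64, interval time.Duration) {
	var ballast []byte
	for range time.Tick(interval) {
		var ms runtime.MemStats
		runtime.ReadMemStats(&ms)
		heap := ms.HeapAlloc - uint64(len(ballast)) // non-ballast heap, roughly
		var want uint64
		if heap < targetBytes {
			want = targetBytes - heap
		}
		// Only resize when meaningfully off target, to avoid re-allocating
		// a huge slice on every tick.
		if diff := int64(want) - int64(len(ballast)); diff > 64<<20 || diff < -(64<<20) {
			ballast = make([]byte, want)
		}
	}
}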

Related discussions

A related idea is to have a mechanism for limiting the max heap size (see #16843 and other linked discussions). However, I believe that a min size is a much simpler problem to solve since it doesn't require application coordination (backpressure).

Also related to the old issue #9067.


I'm happy to give this the full proposal document treatment if that's useful. It seems like a simple idea that doesn't necessarily need it.

Based on my limited understanding of the garbage collector, this would be easy to implement and wouldn't add much complexity to the GC.

/cc @aclements @RLH

@cespare cespare added this to the Go1.11 milestone Dec 8, 2017
@RLH
Contributor

RLH commented Dec 20, 2017

The root cause is that the GC does not know how much RAM is available, so it is understandably conservative. If the GC knew how much RAM it had to play with, the problem could be handled transparently by the GC. As the related discussion notes, #16843 proposes a mechanism for specifying a maximum heap size. Once the GC knows the maximum heap available, it can use GC frequency to help determine when to start the next GC cycle. One idea is to increase the heap size until the GC runs at a reasonable frequency or the maximum heap size is reached. While this proposal is much simpler, it does not address many of the issues #16843 addresses.

@cespare
Contributor Author

cespare commented Dec 20, 2017

@RLH I agree that some solutions to #16843 might address this, but frankly the discussion around that problem has not looked promising from the outside. It seems like the theoretical and technical barriers to a good max-heap API are large.

Meanwhile, the GC frequency problems described in this issue are -- to us, at least -- more pressing than the max-heap issue, admit a much simpler solution, and have only bad workarounds today.

@RLH
Contributor

RLH commented Dec 21, 2017

We are working our way through the issues related to #16843. We have a prototype implementation at https://go-review.googlesource.com/c/go/+/46751 that the community is welcome to comment on. Such public prototypes are intended to address the implementation barriers, while discussion of the theoretical barriers is part of the process. Adding knobs to the GC is a very heavyweight process.

That said, your problems are real. Perhaps a discussion concerning what the GC should use internally as the default minimum heap size, currently 4MB, is warranted. For example, would a default heap size derived from typical cloud virtual machine instances be a reasonable start? In December 2017 a standard 4-CPU instance (think GOMAXPROCS) comes with 16 GB. A "high CPU" 4-CPU instance comes with 3.6 GB. #16854 could be used to limit heap size to something below the default so small-heap functionality wouldn't be lost.

@cespare
Contributor Author

cespare commented Dec 21, 2017

We are working our way through the issues related to #16843. We have a prototype implementation at https://go-review.googlesource.com/c/go/+/46751 that the community is welcome to comment on. Such public prototypes are intended to address the implementation barriers, while discussion of the theoretical barriers is part of the process. Adding knobs to the GC is a very heavyweight process.

Thanks. It seems like CL 46751 doesn't address my issue, though. (At least I don't see how from the documentation; I haven't understood all of the code.)

Furthermore, I don't necessarily have any kind of pushback mechanism to react to the notifications and I would not like to replace the current fast-OOM behavior on excessively large heaps (good) with any kind of thrashing/death spiral (bad).

That said, your problems are real. Perhaps a discussion concerning what the GC should use internally as the default minimum heap size, currently 4MB, is warranted. For example, would a default heap size derived from typical cloud virtual machine instances be a reasonable start? In December 2017 a standard 4-CPU instance (think GOMAXPROCS) comes with 16 GB. A "high CPU" 4-CPU instance comes with 3.6 GB. #16854 could be used to limit heap size to something below the default so small-heap functionality wouldn't be lost.

If the minimum were raised to, say, [40 MB × GOMAXPROCS] then that would be helpful to us.

I expect the people running Go on Raspberry Pis and such would have something to say about that, though.

@RLH
Contributor

RLH commented Dec 21, 2017 via email

@rsc
Contributor

rsc commented Jan 29, 2018

ping @aclements @RLH

@elvarb

elvarb commented Feb 19, 2018

I have a similar problem but a different use case. The program processes lots of data from disk and can run for a very long time. The memory usage of the Go program itself is not much of an issue, but the server it runs on does more than just run this program. When the server is running other tasks, for example backups, it can run out of memory. That is not a problem for the other applications, but the Go application crashes when it wants a little more memory and is denied.

My preferred solution would be a setting on how to handle this situation.

  1. Default: crash. This helps deal with memory leaks: if a Go service leaks memory, it crashes and the service manager restarts it.
  2. Halt: keeps long-running processes alive. When memory cannot be allocated, pause all processing and try again after a configurable time.

@artyom
Member

artyom commented Feb 19, 2018

@elvarb I believe you can already achieve the behavior you described on Linux using cgroups and their OOM-related knobs: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt

@elvarb

elvarb commented Feb 19, 2018

@artyom I'm running into this issue on Windows, does something like cgroups exist there?

@cespare
Contributor Author

cespare commented Feb 19, 2018

@elvarb what you're describing sounds quite different from the specific setting I'm requesting in this issue. I suggest you open a new issue to discuss it.

@cespare
Contributor Author

cespare commented Feb 23, 2018

I ran into this again today with another service that was doing 15 collections/sec. This one is a fairly simple webserver that receives 10-15k requests per second and turns them into protobuf messages that are sent along the network to other systems for processing.

@aclements
Member

I think there are two basically orthogonal things going on here.

GC amortization failure

Assuming I understand the problem, I think what's actually going on here is a failure of the runtime to amortize GC costs. Currently, the GC goal (the payment) is in terms of heap size, but the actual GC cost is proportional not just to the heap size but also to the size of the globals, the sizes of the stacks, and possibly some small fixed overhead. At large heaps, the relative contributions of the other factors tend toward zero, but at small heaps they can be a significant portion of the cost. (For the record, we've been here before: #19839 :)

@cespare, for the real situations where you've observed this, I'd love to know how much data those programs have in globals (add the sizes of the .data and .bss segments) and in stacks (roughly runtime.MemStats.StackInuse).

Here's a trivial example program to demonstrate my thinking: https://play.golang.org/p/k69Zo0C7M1F. Here's the measured GC wall clock time on my laptop with GOMAXPROCS=1 (not necessary, but keeps things predictable) with two different sizes of globals as the heap grows:

[figure: measured GC wall clock time as the heap grows, for two sizes of globals]

Perhaps more to the point, we can look at the proportional GC cost:

[figure: proportional GC cost over the same range of heap sizes]

The 4MB minimum is meant to truncate away the really, really bad proportional cost at the left, but even with just 10MB of globals that's clearly not enough.

Given this, I propose that we tweak the definition of GOGC to be proportional to the total cost of GC, which includes at least heap, globals, and stacks. For applications with larger heaps, the difference probably won't be noticeable. But I believe this may solve the "small heap problem" without the need for extra knobs or potentially-dangerous rate limiting.

For example, consider a program that has 100MB of live heap and 100MB of globals. In the current scheme, the footprint grows to 300MB total, but GC has to scan 200MB for 100MB of growth, making GC twice as expensive as an equivalent program with 0MB of globals. In the proposed scheme, the footprint would grow to 400MB total, but GC would scan 200MB for 200MB of growth, so the GC cost is identical to the program with 0MB of globals.
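
Stated as formulas, a simplified sketch of the two pacing rules (here "globals" and "stacks" stand in for everything else the GC must scan):

// currentGoal: the heap may grow by GOGC% of the live heap alone, even
// though the GC also has to scan globals and stacks.
func currentGoal(liveHeap, gogc uint64) uint64 {
	return liveHeap + liveHeap*gogc/100
}

// proposedGoal: the heap may grow by GOGC% of everything the GC scans, so
// the scan work per byte of growth is the same regardless of globals and stacks.
func proposedGoal(liveHeap, globals, stacks, gogc uint64) uint64 {
	return liveHeap + (liveHeap+globals+stacks)*gogc/100
}

// With 100 MB live heap and 100 MB globals at GOGC=100: currentGoal gives a
// 200 MB heap (300 MB footprint), proposedGoal a 300 MB heap (400 MB
// footprint), matching the example above.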

SetMaxHeap

Fixing GC amortization may or may not fix your problem (I'll have a better sense if you can measure the globals and stacks). However, based on some of the things you said, I think SetMaxHeap might be a reasonable solution, or I might be picking quotes too carefully. :)

We don't actually care about the GOGC ratio; we want to target a particular heap size.

This sounds like exactly what SetMaxHeap does. If you know how big you want the heap, you can set GOGC to ~infinity (in TeX tradition, say infinity=10000) and put the heap entirely under the control of SetMaxHeap.

We tried a solution that involved a long-running goroutine in every application watching memory use (via runtime.ReadMemStats) and adjusting SetGCPercent.

I think the SetMaxHeap channel would let you do this sort of thing much more effectively. For example, if you're okay with the heap growing, you can use the channel to observe that the heap is under pressure and rather than reducing your application's heap usage (the normal use of the channel) you could raise the heap limit. This is cheap to do (doesn't trigger a GC). And it's okay if there's some lag because it's still a soft limit: in the worst case, the GC will expend some extra cycles trying to keep you under an unnecessary limit, but it's not going to OOM your process.

@cespare
Contributor Author

cespare commented Mar 21, 2018

@aclements thanks very much for your detailed consideration.

I pulled some metrics for one of my example programs. Let me know if you need more. (These numbers are after removing my "ballast" workaround.)

Program K:

entity                                              size
.data (as reported by size -A)                      69 KB
.bss (as reported by size -A)                       135 KB
Heap size (as reported by runtime.MemStats.Alloc)   70 MB - 120 MB
StackInuse                                          31 MB
# of goroutines                                     ~7300
Allocation rate                                     650 MB / sec
GC rate                                             12 collections / sec

Everything you're saying about SetMaxHeap sounds interesting. Should I be trying out CL 46751 and giving feedback for these use cases?

@cespare
Contributor Author

cespare commented Mar 21, 2018

From the above, I'm assuming that the .data and .bss sizes are insignificant and that we can point the finger at the stack:heap proportion.

I looked at 3 other programs where we've noticed a high GC rate (not all of these were using enough CPU to warrant addressing) and I can confirm that they all have similar .data/.bss sizes.

@rsc
Contributor

rsc commented Apr 23, 2018

Leaving for @aclements. Marking proposal-hold so we don't see it at proposal review but it's really just on hold for Austin to accept when the runtime team is ready.

@cespare
Contributor Author

cespare commented Aug 14, 2018

This has been noted elsewhere (don't have the issues handy) but I also want to point out that this "small heap" problem is pretty common when running benchmarks (i.e., with go test -bench ...). This is a scenario where heaps are typically pretty small and so if your benchmark includes allocations you can sometimes end up measuring the cost of doing a huge number of GCs -- much more than the real application sees.

(How to write and interpret benchmarks in the face of GC is a more general -- and difficult -- problem, of course, but benchmarks are the other place where I end up fiddling with GOGC and it would be much easier to explain GOMINHEAP or SetMaxHeap than GOGC=5000 in this context.)

@LK4D4
Contributor

LK4D4 commented Sep 18, 2018

In an effort to mitigate #18155 we reduced our heap size by ~90% (mostly using sync.Pool) and are now facing this issue, where we spend more time in GC cycles than outside of them.
We will probably go the ballast route, but having SetMaxHeap would be nice.

@CAFxX
Contributor

CAFxX commented Dec 19, 2018

Heap ballast

...
Obviously this isn't a great solution for the long term since it wastes memory that could otherwise be used for something else (like disk cache).

This is unrelated to fixing the actual issue, but as a potential way to mitigate the downsides of the workaround: have you considered marking the space used by the ballast as MADV_DONTNEED? (I'm assuming you don't store any data inside the ballast.)
This should make the ballast count towards GC calculations while allowing the OS to reuse those ranges for something else (like page caches, or other processes). And if you (or the runtime) ever access the contents of the ballast, the OS will hand back zero-filled pages on demand.
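
For what it's worth, a sketch of that idea on Linux (assumptions: golang.org/x/sys/unix is available and the ballast is never written to; madvise operates on whole pages, so the slice is trimmed to page boundaries):

package main

import (
	"os"
	"runtime"
	"unsafe"

	"golang.org/x/sys/unix"
)

// adviseDontNeed tells the kernel it may reclaim the pages backing b.
// The allocation still counts toward the Go heap, but the OS can reuse the
// physical memory; if the pages are ever read again they come back
// zero-filled on demand.
func adviseDontNeed(b []byte) error {
	if len(b) == 0 {
		return nil
	}
	page := os.Getpagesize()
	addr := uintptr(unsafe.Pointer(&b[0]))
	// Skip ahead to the first page boundary inside b and round the
	// length down to whole pages.
	off := 0
	if rem := int(addr % uintptr(page)); rem != 0 {
		off = page - rem
	}
	n := (len(b) - off) / page * page
	if n <= 0 {
		return nil // ballast smaller than one page; nothing to do
	}
	return unix.Madvise(b[off:off+n], unix.MADV_DONTNEED)
}

func main() {
	ballast := make([]byte, 1<<30) // 1 GiB of ballast
	if err := adviseDontNeed(ballast); err != nil {
		panic(err)
	}
	// ... application runs here ...
	runtime.KeepAlive(ballast)
}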

@andybons andybons added this to the Go1.14 milestone Jul 8, 2019
@pierrre

pierrre commented Jul 21, 2019

About the memory ballast, is it a valid implementation to declare it as a global variable?
In the implementations above, it is declared as a "local" variable inside a function, and runtime.KeepAlive() is used.
My goal is to create a simple package that I can import in my applications (I want to test it).

package ballast

import (
	"os"
	"strconv"
)

const envVar = "GO_BALLAST"

// alloc stays reachable for the lifetime of the program via this
// package-level variable, so no runtime.KeepAlive is needed.
var alloc []byte

func init() {
	// The ballast size, in bytes, comes from the GO_BALLAST environment
	// variable; if it is unset, no ballast is allocated.
	s, ok := os.LookupEnv(envVar)
	if !ok {
		return
	}
	i, err := strconv.Atoi(s)
	if err != nil {
		panic(err)
	}
	alloc = make([]byte, i)
}

@aclements
Member

@pierrre, it depends on how you use the global variable. A global variable like var ballast [16<<20]byte does not currently count towards GC scheduling (though really it should; see #23044 (comment)). However, a global variable that contains a pointer to a large heap-allocated object, such as a slice that you make in init, will work as a ballast.
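
A minimal illustration of that distinction (a sketch added for clarity; the 256 MiB size is arbitrary):

package ballast

// Lives in .bss, not on the Go heap: today this does not count toward the
// GC's heap goal.
var arrayBallast [256 << 20]byte

// The backing array is heap-allocated when the package is initialized, so
// this does count toward the GC's heap goal and works as a ballast.
var sliceBallast = make([]byte, 256<<20)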

@cespare
Contributor Author

cespare commented Jul 31, 2019

@aclements I don't think I follow the reasoning in #23044 (comment). Once the live heap size is larger than GOGCMIN/2, we exit the GOGCMIN regime, so there's no asymptotic death spiral issue.

To come at it from another angle: today the GC has a minimum heap size of 4 MiB. Essentially, this proposal is simply a way to increase that minimum. Why don't your conditioning concerns apply to the 4 MiB minimum?

I'm concerned that the "max heap" problem is turning out to be really hard, and it seems like it might take a while longer yet to figure out an API for CL 46751. In the meantime, this seemingly-easier problem is going unaddressed, and the "heap ballast" hack is working its way into Go community lore.

That said, I'm not clear on whether my problem can be fixed by more accurate amortization. Are the issues you mentioned in May 2018 (#23044 (comment)) still present? In that comment, you wrote

Fixing GC amortization may or may not fix your problem (I'll have a better sense if you can measure the globals and stacks).

I followed up with the numbers you asked for. What do they indicate to you?

@cespare
Contributor Author

cespare commented Oct 16, 2019

@aclements bump on the questions I asked in #23044 (comment).

@rsc
Contributor

rsc commented Feb 26, 2020

The API the GC team designed for this issue is #29696.
Closing this one in favor of that one.

@cespare
Contributor Author

cespare commented Mar 29, 2020

@rsc @aclements I'm disappointed by the outcome here.

The API the GC team designed for this issue is #29696.

This doesn't make sense to me. Issue #29696 has an API designed by @josharian. It is definitely not "for this issue" -- it seems to be solving a problem related to the interaction of caches and GC, which is unrelated except insofar as the topic is "garbage collection".

I tried to carefully describe the problem we were having, which we have experienced again and again over the years, and I collected supporting data. To review what happened:

@aclements mentioned two separate things, both of which are unresolved.

  1. The GC doesn't amortize its cost on small heaps properly. AFAIK this is a bug and it still exists. @aclements's comment on this now-closed issue is the best description of it. (It may also be tracked by runtime: GC can thrash on very small heaps #22743.)

    • I followed up on this specific issue, providing additional data from production servers and asking whether it validates this hypothesis, here, here, here, and here, and I never got a response.
  2. @aclements suggested that SetMaxHeap may be useful as a workaround, by running my application with a target max heap size and GOGC=off. (Besides this thread, we also discussed this in person once or twice.)

    I tried this out, and it works, but it's definitely not ideal since it requires me to come up with a max heap size when I otherwise wouldn't have to consider that. (I just want the GC to stop running so much on a tiny heap.) The SetMaxHeap API has now been bouncing around for years, and has been internally tested at Google, and it sounds like the experiment has basically failed, because it's so hard to use properly for its intended purpose.

    Meanwhile, the actual problem I'm having (for which SetMaxHeap was only ever a hacky workaround, and which is much simpler to solve) has seemingly been ignored. The conflation of my problem with SetMaxHeap, and with the other problems people have with the GC, has made the conversation on this issue less clear than it could have been.

Finally, a year and a half after I filed this issue, @aclements wrote the first actual argument against my proposal (other than "we don't want to add knobs"), describing how a SetMinHeap API could lead to bad performance characteristics. The critique didn't make sense to me, though, and my follow-up questions were ignored.

And then, after 7 months more of silence, @rsc closed this issue as a duplicate of an unrelated issue in February.

At every step in this communication, the round trip time can be measured in months or quarters.

One of the worst outcomes, I think, is that at this point the Go community has absorbed the idea of "heap ballast" as the standard fix for this problem. (Just see all the links that GitHub attached to this issue automatically, or read the Twitch blog post that has made the rounds several times.) I never imagined, when I wrote about our heap ballast hack in 2017, that three years later it would still be the best solution.

@coder543

coder543 commented Oct 28, 2020

It has now been 7 months since @cespare's last comment on this issue. Given the distribution of both discussion and votes, it seems like most people here outside of the core team agree that this is a problem. I was rather hoping @cespare's comment on the confusing nature of the path this issue took to being closed would be addressed or that the Go team would reconsider this idea.

I don't understand why this issue is closed when there have been a number of experience reports that point to the need to use a ballast as a hacky solution in certain scenarios. SetMinHeap (or a more intelligent per-goroutine minimum heap to trigger collection) would seem to be worthwhile.

I also think the original decision to close this issue as a duplicate of #29696 doesn't hold up. How is this a duplicate of #29696? No explanation has been given.

Please reopen this issue.

@ianlancetaylor
Contributor

The intent is that #29696 is the place to discuss APIs that programs can use to control the heap. There is a lot of discussion of SetMaxHeap on that issue, and above there are comments that SetMaxHeap can address the problem described here. #29696 has 32 comments on it, so that doesn't seem too overwhelming yet.

@cespare
Contributor Author

cespare commented Oct 28, 2020

@ianlancetaylor I don't know why we need a single blanket issue for all possible GC/heap APIs. This issue describes a specific problem admitting straightforward solutions and from my perspective it is the conflation with other things that share some keywords that has totally derailed it.

I described in #23044 (comment) why SetMaxHeap does not seem like a satisfactory solution, but to review:

  1. I don't want to set a max heap, only a min heap, so at best it's a hacky workaround.
  2. There has been a CL for SetMaxHeap for 3.5 years now and AFAICT it may well be another few years before anything actually lands on tip. Apparently max heap size is a thorny problem. But I do not believe min heap size is a thorny problem. (As one data point: there is already a min heap size of 4 MiB; I just want a way to raise it.)

@rsc
Contributor

rsc commented Nov 4, 2020

I don't know why we need a single blanket issue for all possible GC/heap APIs.

I can answer this one. Because, in general, if we evaluate every single API in isolation, what you end up with is a LOT of APIs, each of them justifiable on its own but collectively not.

@mknyszek
Contributor

mknyszek commented Nov 6, 2020

@cespare We've discussed this issue. It's unfortunately not as simple as just making the existing minimum heap size configurable. Austin alluded to this earlier, but the primary issue is that setting a hard line like this causes significant problems for the GC pacer (that is, the logic that decides when to start a concurrent mark phase).

Consider the following example:

  1. An application sets a minimum heap size of 1 GiB, with GOGC=100.
  2. For the most part, the application keeps a live heap of 1 MiB, so the pacer determines that GCs can start fairly late, after a few cycles.
  3. GC cycle N comes along, and as the heap grows to 1 GiB, the amount of live memory grows to 512 MiB (could be more).
  4. When the mark phase for GC N begins, the GC suddenly realizes that 512 MiB is live.
  5. Since the GC has stabilized assuming it only has to scan 1 MiB, GC N starts extremely late and is forced to effectively stop-the-world to avoid massive overshoot.

If the live heap shrinks again, the pacer will need several cycles to recover and in the meantime will be paced very conservatively. The trouble here is a little subtle. Going by GOGC, the system and the user expect that at most 1+GOGC/100 times as much memory (in this case 2x) is going to be live by the end of GC N in the worst case. Yet here we have 512x the amount of live memory in GC N as in the previous GCs. In general, this worst case is unbounded and irrespective of GOGC. One might argue that the same thing could happen if, say, GOGC was set to 51100, but the trick is that the user asked for a CPU/RAM tradeoff of GOGC=100, not 51100. In this sense, a hard minimum may become somewhat of a footgun.
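
To put a number on that mismatch, an illustrative calculation (assuming the usual goal = live * (1 + GOGC/100) relationship):

// effectiveGOGC is the GOGC setting that would produce the same heap goal
// as a hard minimum of floorBytes over a live heap of liveBytes.
func effectiveGOGC(floorBytes, liveBytes uint64) uint64 {
	return 100 * (floorBytes - liveBytes) / liveBytes
}

// effectiveGOGC(1<<30, 1<<20) == 102300 and effectiveGOGC(1<<30, 2<<20) == 51100:
// a 1 GiB floor over a couple of MiB of live data is a far more aggressive
// CPU/RAM tradeoff than the GOGC=100 the user actually asked for.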

This example is admittedly quite extreme, but I believe it illustrates the unexpected instability introduced by a hard minimum where the live heap is changing underneath it. Note that the existing heap minimum is broken in this exact way. As Austin mentioned earlier, it's a hack to cut off issues with the GC pacer not being aware of goroutine stacks and globals and such (which should be fixed! I'm hoping this work will illuminate that problem a little more since although from the user's perspective they're orthogonal, they do involve the same mechanism). It just happens to work fine because 4 MiB is quite small, so on most modern hardware, the GCs will be so short that the pacer barely plays a role anyway.

There are a number of issues with the pacer (this included!) that suggest various (related) aspects of the pacer's design deserve revisiting. I plan to file a new issue to track that and also explain some of our thinking in more detail, in the interest of further transparency. I'll post back here once I've created it.

@mknyszek
Contributor

mknyszek commented Nov 6, 2020

Added this general issue to #42430.
