x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665

myitcv · 2023-09-15T19:18:20Z

What version of Go are you using (`go version`)?

$ go version
go version go1.20.6 linux/arm64
$ go list -m golang.org/x/tools
golang.org/x/tools v0.13.1-0.20230915141654-e2393aba883a
$ go list -m golang.org/x/tools/gopls
golang.org/x/tools/gopls v0.0.0-20230915141654-e2393aba883a

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN="/home/myitcv/dev/cuelang/bin"
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/myitcv/gostuff/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.20.6"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/dev/cuelang/cue/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2724720507=/tmp/go-build -gno-record-gcc-switches"

What did you do?

git clone https://review.gerrithub.io/cue-lang/cue
git checkout 42ae8e4594a203000488dbaa1d143574fb8588c6
vi cue/types.go:146:10

The line starts as:

func (o *structValue) marshalJSON() (b []byte, err errors.Error) {

Change that to:

func (o *) marshalJSON() (b []byte, err errors.Error) {

Then:

func (o *struc_) marshalJSON() (b []byte, err errors.Error) {

(with the cursor in the _ position, then attempt completion.

What did you expect to see?

Fast results.

What did you see instead?

Slow results.

The following table shows some pretty accurate wall time numbers for the completion call returning. For reference, my machine is an M1 Max 64GB memory, running Linux via VMWare Fusion.

`golang.org/x/tools` Commit	Cold start, no daemon	Same completion, repeated after cold start	Fresh session with daemon running from previous session
golang/tools@`ac2946029`	102ms	102ms	102ms
golang/tools@`41e4e5654988`	2032ms	243ms	2069ms
golang/tools@`e2393aba883a` (current tip)	2172ms	241ms	2213ms

Log files for each result from the first column of timings:

I have bisected this slowdown to golang/tools@41e4e5654988.

This appears to highlight a couple of things to me:

Type checking has slowed down at least in this use case (assuming that is the main factor influencing completion speed).
Something possibly isn't working as expected with a pre-existing daemon instance, which is more easily seen given the previous point.

cc @findleyr

The text was updated successfully, but these errors were encountered:

findleyr · 2023-09-15T19:37:17Z

Thanks for the report. Possibly related: golang/vscode-go#2889.

The CL you bisected to wouldn't have slowed down type checking. However, it would have resulted in many more completion candidates being considered. Before that CL it was possible that gopls miss some struct fields in the current type being selected (!). After that CL we should have no such omissions, but may also pull in additional candidates. The 20x slowdown you observe is likely due to parsing cost for building this candidate list; that cost is mitigated the second time around due to the parse cache.

This was on my radar due to golang/vscode-go#2889, and is among our highest priorities to address.

Based on my current understanding, you should only experience this in certain packages, and should only experience the 20x slowdown at startup, after which subsequent completions should be faster (though perhaps not as fast as before). Can you confirm if that is the case?

CC @adonovan

myitcv · 2023-09-15T19:49:39Z

Thanks.

Do the numbers in the table above not answer your question re subsequent completions?

Note also the slow speeds even with a daemon running which suggests another bug?

findleyr · 2023-09-15T20:02:01Z

Do the numbers in the table above not answer your question re subsequent completions?

Sorry, I mean subsequent completions for other things, not just that one specific source location; during editing do you generally experience a 20x slowdown in completion, or 2.5x slowdown? The table suggests 2.5x, but I wanted to be sure.

Note also the slow speeds even with a daemon running which suggests another bug?

Not a bug, but an oversight: the parse-cache is per-session, not per instance -- the parse cache is meant to be a minor optimization, not the massive optimization that is experienced here. This is actually strong evidence that the parse cache is involved.

Interestingly, the bug does not immediately reproduce for me in neovim or vs code (I see completions around 100ms), though it does reproduce for Alan. We're investigating why.

myitcv · 2023-09-15T20:08:11Z

Sorry, I mean subsequent completions for other things, not just that one specific source location; during editing do you generally experience a 20x slowdown in completion, or 2.5x slowdown? The table suggests 2.5x, but I wanted to be sure.

Truth be told I had to revert to the fast version because I was getting too frustrated :)

So unfortunately I don't have any evidence or examples beyond the analysis presented here.

findleyr · 2023-09-15T20:50:58Z

So, we have new evidence, but we lost our reproducer and could use your help.

On Alan's machine, we started gopls with -debug=localhost:8099 and visited http://localhost:8099/trace. There, we observed (by clicking on textDocument/completion) that the longest completion was ~2.6s. Scrolling down to look at the trace, we saw that ~2.5s of that was in "runProcessEnvFunc".

This is the unimportedCompletion logic scraping the module cache. Unfortunately, we deleted our backup of the huge mod cache, and can no longer reproduce 🤦.

Could you do the following, using gopls@tip (or at the culprit)?

start gopls with -debug=localhost:8099
reproduce, and look at http://localhost:8099/trace. Please share the "Longest" trace for completion, which you can see by clicking on textDocument/completion and scrolling down
try setting "completionBudget": "1ms" to see if it makes a difference (EDIT: not 0s, as 0 => infinity)
try setting "completeUnimported": false to see if it makes a difference

Sorry to enlist you, but absent a reproducer this is hard to track down. The steps above will help confirm our theory.

myitcv · 2023-09-16T05:37:20Z

So, we have new evidence, but we lost our reproducer and could use your help.

Great, happy to help.

All of my response is based on analysis using golang/tools@866a6b0ff32a.

start gopls with -debug=localhost:8099

reproduce, and look at http://localhost:8099/trace. Please share the "Longest" trace for completion, which you can see by clicking on textDocument/completion and scrolling down

This is what I get. Although for some reason I don't have the "runProcessEnvFunc" span details:

    05:40:11.400 end textDocument/completion (+2.073471033s) method="textDocument/completion" direction="in" id="#2"
        05:40:11.400 event (+2.073470075s) label= status.code="OK" 
        05:40:09.327 start queued
        05:40:09.327 end queued (+51.833µs)
        05:40:09.327 start textDocument/completion method="textDocument/completion" direction="out" id="#3"
        05:40:11.400 end textDocument/completion (+2.072909241s) method="textDocument/completion" direction="out" id="#3"
            05:40:11.400 event (+2.073332116s) label= status.code="OK"

try setting "completionBudget": "1ms" to see if it makes a difference (EDIT: not 0s, as 0 => infinity)

Setting to "1ms" did not influence the completion time:

[Trace - 05:43:19.927 AM] Received response 'textDocument/completion - (2)' in 2026ms.

This feels like a separate bug?

try setting "completeUnimported": false to see if it makes a difference

Bingo (note I reverted the completionBudget setting, i.e. no value set in my config):

[Trace - 05:44:32.173 AM] Received response 'textDocument/completion - (2)' in 24ms.

Which triggers an interesting thought for me.

I would absolutely not have expected the edit and completion that I'm performing in this situation to have triggered unimported completions. Because they are so expensive, I would only have expected them to trigger in the following scenario:

blah._

with the cursor at the position _, where blah is calculated to be unresolved in the current scope and hence it is likely that blah. is an attempt to start and now complete a qualified identifier.

Otherwise, a trivial completion attempt in a "hello world" example:

package main

func main() {
	str_
}

takes ~2000ms from cold and ~200ms warm (same session, ignoring the daemon point for now).

Re questions about the VM setup, last time I checked my disk performance through VMWare Fusion was faster than native macOS. Incidentally, you might find that macOS (and windows?) users have been disproportionately reporting more speed problems if unimported completions are dominating as we think they are, especially where those users have large module caches. Might also be worth asking people to report the size of their mod cache for reference, unless you could also lazily compute and report this in log files.

Here is mine:

$ du -sh $(go env GOMODCACHE)
36G     /home/myitcv/gostuff/pkg/mod
$ du -sh $(go env GOMODCACHE)/cache
7.9G    /home/myitcv/gostuff/pkg/mod/cache

(because as I understand it the cache subdirectory does not participate in the unimported scan, because it contains zip files and bare git trees).

In summary from my perspective:

Unimported completions should only be triggered when there is probably a qualified identifier where the package name cannot be resolved. If this goes against the grain of what most people want (for whatever reason) then I think it's reasonable to stick this behind a config option.
Where unimported completions are triggered, then ideally the cache walk is computed ahead of time async and persists with remote gopls instances: x/tools/gopls: unimported completion delay, random results and caches not persisted with remote? #47615. This lack of caching between sessions in the presence of a daemon is, I guess, also the likely cause of the timings I've seen above?
There appears to be a bug where the completionBudget is not honoured, per above.

findleyr · 2023-09-16T15:24:55Z

Thanks for your response. The bug is understood now. Because of the change to how completion budget is implemented, a check for context cancellation that would have prevented unimported completion from running was eliminated.

The unfortunate/problematic thing is that this existed for multiple months without being caught. We do have several benchmarks for completion, but they were insensitive to this change because they don't capture the initial cost and because they operate in a fresh module cache. We've seen a few reports of slow completion our survey results and issues, and I had it on my list to revisit before the next release, but it would have been hard to hone in on this without your clear reproducer. Thank you again.

I'll note that this is a perfect example of where telemetry would have highlighted the regression -- we would have seen the regression in the release, and may even have root-caused it quickly by observing that it affected darwin more than linux.

A few comments:

That 1ms completion budget doesn't cause completion to immediately return is half bug, but also half WAI. The bug is what we've discovered here: certain operations that were previously skipped are no longer being skipped. However, even once these are fixed, setting completionBudget to 1ms won't cause completion to immediately return with empty results. AFAIK, the budget never worked that way when set to a target faster than type checking, which is typically 10-100ms, and I think that's OK. It is a soft latency target, not a hard deadline.
This would all be much less of an issue if we could stream completion results, but VS Code does not support this client-side (see also Add streaming support in vscode microsoft/vscode#105870 (comment)).
Unimported completions runs when completing identifiers because it completes to package names. This is moderately useful, but not worth making completion 10x slower.
We could of course avoid running unimported completion in receiver context, but see the next bullet point.
Unimported completion should not be this slow. There's no compelling reason for this to be the case; unimported packages changes infrequently if at all, and since we only care about exported names we can be much less precise.

You can expect that this will be fixed, and soon. In addition to simply fixing the cancellation logic, there appears to be a lot of low-hanging fruit that can make completion significantly faster.

The complicated part is not the fix itself, but rather the process that led to it.

myitcv · 2023-09-16T16:35:55Z

Thank you for such a wonderfully detailed, comprehensive response. Delighted to help in whatever small way I can, even if that is just trying different scenarios etc.

gopherbot · 2023-09-21T00:23:56Z

Change https://go.dev/cl/530015 mentions this issue: gopls/internal/regtest/bench: add additional completion benchmarks

gopherbot · 2023-09-21T00:23:56Z

Change https://go.dev/cl/530016 mentions this issue: gopls/internal/lsp/source/completion: start timing before type checking

gopherbot · 2023-09-21T18:51:02Z

Change https://go.dev/cl/530215 mentions this issue: gopls: instrument telemetry for latency of important operations

gopherbot · 2023-09-21T20:51:40Z

Change https://go.dev/cl/530017 mentions this issue: gopls/internal/lsp/source/completion: fixes for completion budget

Add instrumentation for the telemetry described in golang/go#63129, along with a test. Also isolate the existing telemetry regtest. Updates golang/go#63129 Updates golang/go#62665 Change-Id: If641b44d430eaaf96c469c804f42fd72cd821bed Reviewed-on: https://go-review.googlesource.com/c/tools/+/530215 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>

Our completion benchmarks were too narrowly focused on type checking speed, and did not provide adequate coverage of unimported completion or completion budget. Run the CompletionFollowingEdit benchmarks with unimported completion enabled, and with a budget of 100ms. The former still seems to have little impact on performance, likely due to my environment, whereas the latter demonstrates a stark regression: with gopls@v0.11, completion timing is bang-on 100ms, whereas with gopls@tip timing is largely unaffected by the budget. Also add a completion test in identifier context, as this exercises different code paths, and further highlights the regression in budget implementation. For golang/go#62665 Change-Id: I8b771225c7e2979ae1eb9e67e79b9b6165e2712b Reviewed-on: https://go-review.googlesource.com/c/tools/+/530015 Reviewed-by: Alan Donovan <adonovan@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>

Partially revert CL 503016, to go back to starting the completion timer before type checking. As discussed in slack and golang/go#62665, even if this leads to inconsistent behavior across various LSP clients (due to order of requests) we should still do our best to interpret the completion budget from the user perspective. For golang/go#62665 Change-Id: I2035e2ecb7776cead7a19bd37b9df512fdbf3d17 Reviewed-on: https://go-review.googlesource.com/c/tools/+/530016 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Alan Donovan <adonovan@google.com>

Address a few considerations that were overlooked in CL 503016: - don't run goimports callbacks if the budget has expired - don't use a min depth for the second call to deepSearch For golang/go#62665 Change-Id: I23e2a3154049eabaff14c9c5071e7eea26e7c619 Reviewed-on: https://go-review.googlesource.com/c/tools/+/530017 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Alan Donovan <adonovan@google.com>

findleyr · 2023-09-25T20:44:59Z

@myitcv I believe gopls@master should now have similar latency to what you observed in April. Please let me know if it doesn't.

However, I'm not closing this issue because there appeared to be a lot of low-hanging fruit to further optimize significantly. For example, I analyzed one of our completion benchmarks on Kubernetes, and observed that type checking was only 65ms, while completion took 170ms. I think we can get completion latency much closer to type checking latency.

Coincidentally, the potential optimizations I see are almost identical to what we did for workspace/symbol requests:

Add concurrency: completion is single-threaded at present.
Avoid pre-building strings, by performing fuzzy match before assembling the fully qualified symbol path.
Use a faster fuzzy matcher (and fix ranking edge cases)
Speed up type inference, somehow. (Not sure if this is possible, but often there are ways to pre-filter candidates)

myitcv · 2023-09-26T04:52:57Z

I believe gopls@master should now have similar latency to what you observed in April. Please let me know if it doesn't.

Thanks, just took a look.

Still seeing similar numbers to before (golang/tools@e2393aba883a) actually. ~20x slow down for the initial completion, and ~2.5x slow down for "warm" completions:

`golang.org/x/tools` Commit	Cold start, no daemon	Same completion, repeated after cold start	Fresh session with daemon running from previous session
golang/tools@`ac2946029`	102ms	102ms	102ms
golang/tools@`fb7463ac1de5`	2032ms	243ms	2069ms

Do I need specific settings to make this "work"?

(I tried setting completionBudget = 100ms but that did not affect things)

One of the big pain points here is that ~2s at startup, with or without a daemon already running. Vim is just one of those tools that you open and close the whole time, so the daemon-based setup is critical.

That said, the ~230ms completion is very noticeable vs ~100ms too.

Coincidentally, the potential optimizations I see are almost identical to what we did for workspace/symbol requests:

Did my suggestions in #62665 (comment) also make sense with regards to optimisations? Because whilst a different kind of optimisation, it feels like we want to avoid unimported completions as much as possible, and when "forced" to include them ensure that we have warm caches in the daemon (as you previously observed).

gopherbot · 2023-09-26T22:14:02Z

Change https://go.dev/cl/530599 mentions this issue: gopls: set a context deadline after minimal completion results

Previously we avoided using context cancellation for implementing completion budget, because it can result in accidentally omitted results (because e.g. constructing of completion items was canceled). But once we have a minimum required set of results, it is OK to use context cancellation to more strictly enforce the completion budget. For golang/go#62665 Change-Id: Ibd10e1f70a396567f8e095bd6824c637cf973518 Reviewed-on: https://go-review.googlesource.com/c/tools/+/530599 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Peter Weinberger <pjw@google.com>

findleyr · 2023-09-27T17:11:41Z

@myitcv can you try again at tip? Sorry to make you keep trying, but I can't reproduce and want to make sure we've successfully rolled back the regression before including additional optimizations.

Did my suggestions in #62665 (comment) also make sense with regards to optimisations? Because whilst a different kind of optimisation, it feels like we want to avoid unimported completions as much as possible, and when "forced" to include them ensure that we have warm caches in the daemon (as you previously observed).

You mean the suggestion not to run unimported completions in various contexts where it doesn't make sense? That's definitely a good idea, but I've concluded that it's not at the heart of the problem you observe (see below).

Upon staring at the code and experimenting, I've drawn the following conclusions:

"normal" completion (meaning when "deepCompletion": false is set) is generally pretty fast following type checking.
There are a lot of places where deep completion could be optimized, as cited in x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665 (comment), however, given that deep completion only adds marginally improved results, the user-facing benefit of these optimizations would not be as large as I'd originally hoped.
The logic of "normal" completions and "deep" completions is a little entangled, so it's hard to change one without changing the other. It would be nice to refactor and make this delineation clearer.

Therefore, I think we should close this bug once we've verified a fix for the regression, and open new bugs for various completion logic improvements, which would constitute a larger project.

myitcv · 2023-09-27T19:05:54Z

Good news! Tested against golang/tools@4b34fbf5f10d and we appear to be back in business!

`golang.org/x/tools` Commit	Cold start, no daemon	Same completion, repeated after cold start	Fresh session with daemon running from previous session
golang/tools@`ac2946029`	102ms	102ms	102ms
golang/tools@`fb7463ac1de5`	2032ms	243ms	2069ms
golang/tools@`4b34fbf5f10d`	102ms	102ms	102ms

Thanks very much for digging into this, @findleyr.

I'll leave you to close the issue or keep open as you see fit.

findleyr · 2023-09-27T19:25:07Z

Fantastic, thanks for confirming. I've opened #63263 to follow up on deep completion performance.

mvdan · 2023-09-27T20:43:52Z

I was sat next to @myitcv earlier when he upgraded gopls and gave it a try - you should have seen his grin from ear to ear!

myitcv added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. gopls Issues related to the Go language server, gopls. labels Sep 15, 2023

gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Sep 15, 2023

gopherbot added this to the Unreleased milestone Sep 15, 2023

findleyr modified the milestones: Unreleased, gopls/v0.14.0 Sep 15, 2023

findleyr added gopls/performance Issues related to gopls performance (CPU, memory, etc). gopls/completion Issues related to auto-completion in gopls. labels Sep 15, 2023

This was referenced Sep 18, 2023

x/tools/gopls: strange preselected completion for unimported package #62560

Closed

x/telemetry/config: add gopls/*/latency histograms #63129

Open

findleyr mentioned this issue Sep 22, 2023

x/tools/gopls: gopls becomes unresponsive from time to time #61969

Closed

findleyr mentioned this issue Sep 27, 2023

x/tools/gopls: optimize deep completion #63263

Open

findleyr closed this as completed Sep 27, 2023

findleyr self-assigned this Sep 27, 2023

findleyr mentioned this issue Sep 28, 2023

x/tools/gopls: express completion scoring in terms of down-ranking factors #63282

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665

x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665

myitcv commented Sep 15, 2023

findleyr commented Sep 15, 2023

myitcv commented Sep 15, 2023

findleyr commented Sep 15, 2023

myitcv commented Sep 15, 2023

findleyr commented Sep 15, 2023 •

edited

myitcv commented Sep 16, 2023

findleyr commented Sep 16, 2023

myitcv commented Sep 16, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

findleyr commented Sep 25, 2023

myitcv commented Sep 26, 2023

gopherbot commented Sep 26, 2023

findleyr commented Sep 27, 2023

myitcv commented Sep 27, 2023

findleyr commented Sep 27, 2023

mvdan commented Sep 27, 2023

x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665

x/tools/gopls: completions taking 20x as long since https://github.com/golang/tools/commit/41e4e5654988 #62665

Comments

myitcv commented Sep 15, 2023

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

findleyr commented Sep 15, 2023

myitcv commented Sep 15, 2023

findleyr commented Sep 15, 2023

myitcv commented Sep 15, 2023

findleyr commented Sep 15, 2023 • edited

myitcv commented Sep 16, 2023

findleyr commented Sep 16, 2023

myitcv commented Sep 16, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

gopherbot commented Sep 21, 2023

findleyr commented Sep 25, 2023

myitcv commented Sep 26, 2023

gopherbot commented Sep 26, 2023

findleyr commented Sep 27, 2023

myitcv commented Sep 27, 2023

findleyr commented Sep 27, 2023

mvdan commented Sep 27, 2023

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

findleyr commented Sep 15, 2023 •

edited