Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options #63832

Closed
hyangah opened this issue Oct 30, 2023 · 17 comments
Closed

x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options #63832

hyangah opened this issue Oct 30, 2023 · 17 comments
Assignees
Milestone

Comments

@hyangah
Copy link
Contributor

hyangah commented Oct 30, 2023

In #63143 we removed the original 'off' option and changed gotelemetry off means "turning off upload". The motivation was to simplify the command interface. Unfortunately, this leads to confusion - users expect to turn off statistics logging completely by doing so.
Moreover, we learned that there are certain cases where some users want to avoid extra disk writes if they are not critical for gopls' core functionalities.

We propose to restore the original 'local' and 'off' distinction.

  • gotelemetry on - collect and submit data to telemetry.go.dev.
  • gotelemetry local - collect data.
  • gotelemetry off - do not collect data.

The default remains 'local'.
The submission of telemetry data is still opt-in. The collection of telemetry data is opt-out like go toolchain's metadata. Data collected before gotelemetry on is not included in submission.

Related - gotelemetry clean shouldn't delete the mode file, but can delete all other files.

@gopherbot gopherbot added the telemetry x/telemetry issues label Oct 30, 2023
@gopherbot gopherbot added this to the Unreleased milestone Oct 30, 2023
@hyangah hyangah modified the milestones: Unreleased, gopls/v0.14.2 Oct 30, 2023
@findleyr findleyr changed the title x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options Proposal: x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options Oct 30, 2023
@findleyr
Copy link
Contributor

Thanks for filing this, @hyangah.

To summarize: in #63143 we consolidated the API to off|on, with local collection always on, because we didn't think there was any reason to avoid recording telemetry data locally -- in most ways telemetry data is similar to other metadata we write to the file system. We've since heard from multiple users that the mere existence of telemetry data (even if it will never be uploaded) can set off policy alerts in some organizations. Furthermore, the fact that gotelemetry off doesn't turn off this local data collection can be confusing.

While in most circumstances we don't think there should be any concerns about local data collection, users shouldn't have to e.g. fight for an exemption in environments where, as a matter of policy, such data collection is prohibited.

Therefore, it seems simplest to have gotelemetry off disable everything, even local collection, and reintroduce the concept of a local mode, which can be selected with gotelemetry local. The default should be local, as we still believe that local collection should not be a problem in most environments.

@doggedOwl
Copy link

doggedOwl commented Oct 31, 2023

I'm very sad to see that still opt-out is being considered as an acceptable approach even for just local data collection, even without sending them. This still forces us to tightly control go installations and it is no small burden.

The main issue with opt-out local collection is that it still creates the potential for problems when/if someone then enables gotelemetry even if approved because now we are not sure that only data from the moment of approval is being sent on. potentially that will send telemetry data collected before the approval.

Opt-out is such a burden and even ethical issue that you are forcing on all European Companies. (I would say it's a ethical issue everywhere but it seems that at least in the US everyone is resigned and Google and others are taking this desperate resignation as a signal that is ok to collect their data without consent.)

@hyangah
Copy link
Contributor Author

hyangah commented Oct 31, 2023

@doggedOwl I updated the proposal to clarify - data collected before gotelemetry on is never included in the submission.

The data on disk helps users make informed decision before considering to opting-in. Users who decide not to opt-in for submission can continue to use the data to inspect and debug toolchain performance/correctness issues they experience, like other metadata.

@doggedOwl
Copy link

That clarification goes a long way. Still an opt-in aproach would be preferable even with this guarantee. gotelementry off would be better as default. Then the flow becomes: We enable gotelemetry local to verify what data is being collected and then at a later stage gotelemetry on.
There is no downside to a full optin aproach.

@seankhliao seankhliao changed the title Proposal: x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options proposal: x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options Oct 31, 2023
@rsc
Copy link
Contributor

rsc commented Nov 1, 2023

@doggedOwl, I appreciate your point of view, but there are good reasons to have counters and crashes written to disk locally. Among them:

  • When a problem happens, users can still consult their local data to see what's out of the ordinary.
  • When we ask users to turn on telemetry, the data is there for them to view and make an informed decision right then, without making it a two-step process.

I will also reemphasize once again that even when 'gotelemetry local' is set and the user changes it to 'gotelemetry on', we never upload the data that was collected in 'local' mode. There is no harm to this data being on local disk, and at least two real benefits.

Also, the go command already writes small amounts of metadata to places like the Go build cache that it uses for things like cleaning up the cache, not to mention the build cache itself as well as the module cache and module cache indexes. The local statistics are simply additional files maintained while running the go command, local to the machine like all the other files.

We understand that not everyone will agree with the default being "write counters and crashes to local disk with no uploading". Some people want "write nothing" and some people think it should be "upload by default". There is nothing we can do that will please everyone. We believe a default of "gotelemetry local" strikes the best balance.

@sbromberger
Copy link

sbromberger commented Nov 2, 2023

There is no harm to this data being on local disk

There is when policy dictates that things stored on the filesystem are subject to remanence and disclosure policies, and that as a result only data that needs to be stored should be. There is a real cost associated with this decision.

Also, the go command already writes small amounts of metadata to places like the Go build cache that it uses for things like cleaning up the cache, not to mention the build cache itself as well as the module cache and module cache indexes

This data is necessary for the operation of the tool. It is also stored in a directory that is designed to hold ephemeral data, so it can be removed without issue and excluded by name from backups. In contrast the telemetry data is not (currently!) necessary for the correct and intended operation of the tool; it serves a few purposes but statistics-gathering for the benefit of an outside organization may not be a valid one as far as some of the more strict systems security auditors are concerned.

That said, I think that local is a reasonable default. Those of us who don't want / can't have this telemetry data stored are in the distinct minority and as long as there is an option that allows us to use the tool with full functionality without storing the data at issue, an opt-in approach doesn't seem to me to be too unreasonable.

@rsc
Copy link
Contributor

rsc commented Nov 2, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

Currently there are two gotelemetry settings:

  • off means write counters and crashes locally (default)
  • on means report counters and crashes from after the setting was changed to “on”

The proposal is to rename “off” to “local”, keeping it the default, and then introduce a new “off” that means don’t write counters and crashes at all.

@hyangah
Copy link
Contributor Author

hyangah commented Nov 3, 2023

I want to clarify one more detail. We will still need to store metadata in the telemetry directory for gotelemetry off - to record user's choice, etc. It's not a bad idea to include them in backup.

@sbromberger
Copy link

We will still need to store metadata in the telemetry directory for gotelemetry off - to record user's choice, etc. It's not a bad idea to include them in backup.

This is a good thing - it will allow folks to scan that file to confirm the setting is appropriate for the environment.

@findleyr findleyr self-assigned this Nov 6, 2023
@findleyr
Copy link
Contributor

findleyr commented Nov 9, 2023

@rsc @adonovan -- any update on this from the proposal committee yesterday?

@gopherbot
Copy link

Change https://go.dev/cl/541376 mentions this issue: all: revert to telemetry mode values (on|off|local)

@adonovan
Copy link
Member

This proposal was accepted during the review meeting, and the decision is recorded in the minutes. I'm not sure why the automatic updates haven't gone out yet--perhaps Russ's travel plans have delayed things.

@rsc rsc changed the title proposal: x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options x/telemetry/cmd/gotelemetry: bring back 'local' vs 'off' options Nov 10, 2023
@rsc
Copy link
Contributor

rsc commented Nov 10, 2023

The minutes have now posted (and would have commented here except the state was already changed). Thanks!

gopherbot pushed a commit to golang/telemetry that referenced this issue Nov 13, 2023
This CL is a partial revert of CL 530436, adding back the 'local'
telemetry mode, which allows local collection but not uploading. The new
meaning of "off" is to disallow both collection and uploading.

For golang/go#63832

Change-Id: Iae7e537e31beef185c27f3877ff23dbae1df6dfd
Reviewed-on: https://go-review.googlesource.com/c/telemetry/+/541376
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/542055 mentions this issue: internal/counter: don't write counters to disk if mode=off

gopherbot pushed a commit to golang/telemetry that referenced this issue Nov 14, 2023
It was decided in golang/go#63832 that when the telemetry mode is "off"
no telemetry data should be written to the file system. However, the
current implementation of "off" still creates the counter file -- it
just doesn't produce any reports.

Fix this to not even create the counter file when the mode is off. This
was rather tricky to implement, as it required auditing a lot of code to
see that we don't reach openMapped. However, per discussion with pjw@ it
should be sufficient to implement this check in Open, as is done here.

This broke some tests because
1. some tests were reading the wrong mode file (fixed by setting
   telemetry.ModeFile in counter_test.go)
2. the E2E tests were incorrectly escaping the RunProg test.run regexp
   (fixed by simplifying the RunProg logic)

For golang/go#63832

Change-Id: I47066a97a8fd17c4c2be776077d718e9cdbaf65a
Reviewed-on: https://go-review.googlesource.com/c/telemetry/+/542055
Auto-Submit: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Peter Weinberger <pjw@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/542317 mentions this issue: gopls: upgrade x/telemetry and account for new mode logic

gopherbot pushed a commit to golang/tools that referenced this issue Nov 14, 2023
Upgrade x/telemetry to pick up the new (on|off|local) mode logic. In
this new schema, when the mode is explicitly "off" we should assume that
the user doesn't want to be prompted about enabling telemetry. Update
our internal logic and tests accordingly.

For golang/go#63832

Change-Id: I7b9c0840c48c680110ffa84c59bce2d5249942dd
Reviewed-on: https://go-review.googlesource.com/c/tools/+/542317
Reviewed-by: Peter Weinberger <pjw@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Copy link

Change https://go.dev/cl/542156 mentions this issue: gopls: upgrade x/telemetry and account for new mode logic

gopherbot pushed a commit to golang/tools that referenced this issue Nov 14, 2023
Upgrade x/telemetry to pick up the new (on|off|local) mode logic. In
this new schema, when the mode is explicitly "off" we should assume that
the user doesn't want to be prompted about enabling telemetry. Update
our internal logic and tests accordingly.

For golang/go#63832

Change-Id: I7b9c0840c48c680110ffa84c59bce2d5249942dd
Reviewed-on: https://go-review.googlesource.com/c/tools/+/542317
Reviewed-by: Peter Weinberger <pjw@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
(cherry picked from commit 944d4e7)
Reviewed-on: https://go-review.googlesource.com/c/tools/+/542156
Reviewed-by: Alan Donovan <adonovan@google.com>
@findleyr
Copy link
Contributor

This behavior has now been released in gopls@v0.14.2, so this is done.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Nov 19, 2023
This release contains just one change: an upgrade of x/telemetry
to pick up support for the "local" telemetry mode (golang/go#63832).

Previously, when the telemetry mode was "off" (the default), counter
data would not be uploaded, but would be written to the
os.UserConfigDir()/go/telemetry/local directory of the local file
system. We heard from a few users that, as a matter of policy within
their organization, they need a way to prevent even this local data
from being written. With this release, running gotelemetry off will
stop gopls from writing this local counter data. Note that the
os.UserConfigDir()/go/telemetry/mode file must be written to record
the "off" state.

The new default telemetry mode is "local", which behaves the same
way as "off" did before. In "local" mode, counter data is written
to the local file system, but not uploaded. Local data can be
inspected with the gotelemetry view command.

See golang/go#63832 for more details. Thanks again for helping us
support transparent telemetry in gopls. As described in the v0.14.0
release notes, we are confident that this data will help us produce
a better, faster, more reliable product. In fact this is already
happening.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

7 participants