Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

all: add opt-in transparent telemetry to Go toolchain #58894

Open
rsc opened this issue Mar 6, 2023 · 27 comments
Open

all: add opt-in transparent telemetry to Go toolchain #58894

rsc opened this issue Mar 6, 2023 · 27 comments

Comments

@rsc
Copy link
Contributor

rsc commented Mar 6, 2023

In February I posted a series of blog posts defining Transparent Telemetry, and we had a lively discussion on #58409. In the original posts, the design was opt-out. Based on that discussion as well as private discussions with long-time contributors and users, I revised the design to be opt-in.

I propose that we add opt-in transparent telemetry to the Go toolchain as described in those posts, specifically “The Design of Transparent Telemetry.” Transparent Telemetry has the following key properties:

  • The decisions about what metrics to collect are made in an open, public process.
  • The collection configuration is automatically generated from the actively tracked metrics: no data is collected that isn’t needed for the metrics.
  • The collection configuration is served using a tamper-evident transparent log, making it very difficult to serve different collection configurations to different systems.
  • The collection configuration is a cacheable, proxied Go module, so any privacy-enhancing local Go proxy already in use for ordinary modules will automatically be used for collection configuration.
  • Uploaded reports only include total event counts over a full week, not any kind of time-ordered event trace.
  • Uploaded reports do not include user IDs, machine IDs, or any other kind of ID.
  • Uploaded reports only contain strings that are already known to the collection server: counter names, program names, and version strings repeated from the collection configuration, along with the names of functions in specific, unmodified Go toolchain programs for stack traces. The only types of non-string data in the reports are event counts, dates, and line numbers.
  • IP addresses exposed by the HTTP session that uploads the report are not recorded with the reports.
  • Thanks to sampling, only a constant number of uploaded reports are needed to achieve a specific accuracy target, no matter how many installations exist. Specifically, only about 16,000 reports are needed for 1% accuracy at a 99% confidence level. This means that as new systems are added to the system, each system reports less often. Exactly how often will depend on how many systems opt in.
  • The aggregate computed metrics are made public in graphical and tabular form.
  • The full raw data as collected is made public, so that project maintainers have no proprietary advantage or insights in their role as the direct data collector.
  • The system is off by default and requires an explicit opt-in.

Please note, as described in the Why Telemetry? section of the intro post, that telemetry addresses a different kind of problem than bug reports and surveys. In particular, relying on bug reports is not sufficient to identify problems that don't obviously impact functionality, including performance problems, and surveys are not sufficient to identify the variety of usage and contexts where Go is used and which would inform prioritization of effort.

There is good reason to believe that with even tens of thousands of users opted in, we should be able to get helpful data. It will not be as complete as the opt-out system, but it should be good enough. As described in the Can We Still Make Good Decisions? section of the opt-in post, there will be certain biases in the data based on who is more likely to opt in. Once we have data, it would make sense to compare “technical demographics” like operating system and editor against the annual Go survey and Stack Overflow surveys. If there is significant skew, we could look into reweighting the data as standard polls do (https://en.wikipedia.org/wiki/Iterative_proportional_fitting).

For examples of the kinds of questions we'd use telemetry to answer and the kinds of decisions those answers would inform (but not decide directly), see “Use Cases for Transparent Telemetry”.

@rsc rsc added this to the Proposal milestone Mar 6, 2023
@rsc
Copy link
Contributor Author

rsc commented Mar 8, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@willfaught
Copy link
Contributor

@rsc I didn't get a chance to ask this before the discussion was closed:

Uploaded reports only contain strings that are already known to the collection server: counter names, program names, and version strings repeated from the collection configuration, along with the names of functions in specific, unmodified Go toolchain programs for stack traces.

Why not assign every known string a number, and then use those numbers instead of sending strings? While sending the strings themselves is equivalent to the numbers, there's a "slippery slope" whiff I get from it. Publishing the known string index is another way to assure users that their privacy is being protected. There's an extra feeling of security knowing that only numbers are being sent.

@Merovius
Copy link
Contributor

Merovius commented Mar 9, 2023

@willfaught I actually think it would be less transparent, not more. Because instead of seeing the actual human-readable information that is sent, you get a bunch of cryptic numbers that you have to interpret. Even if it was increasing a "feeling of privacy", there is a good argument to be made that giving such a feeling without actually increasing privacy (which this wouldn't do) is a bad thing. This would be privacy theater, if you will. Combine that with the fact that this would be significantly more churn and work - you either have the mapping be automatic, in which case it might completely re-number everything on every Go release, or you have to manually keep it up to date whenever something changes - and this seems an overall pretty bad idea.

@alvarolm
Copy link

alvarolm commented Mar 9, 2023

Thank you for:

allowing anyone to participate in this this conversation
listening to the community
making improvements based on public feedback

So far I see a thoughtfully and polished proposal with user privacy as a fundamental guideline.
Currently I'm prone to contribute and participate opting in, as this is a clear and reasonable way to help furthering the language development.

I trust the spirit of the proposal that safeguards the balance between privacy and functionality is preserved and never broken.

I'm making these comments as simple user who has chosen to use the language for work and personal use for many years now, not having any significant reason to doubt or question this choice.

Grateful for the work being done and the direction where the language is going.

@willfaught
Copy link
Contributor

I actually think it would be less transparent, not more. Because instead of seeing the actual human-readable information that is sent, you get a bunch of cryptic numbers that you have to interpret.

Most of the data sent will be counters/numbers anyway. The format is binary, if I remember correctly, so it can be mmap'ed. This wouldn't affect the human readability of the data either way.

Even if it was increasing a "feeling of privacy", there is a good argument to be made that giving such a feeling without actually increasing privacy (which this wouldn't do) is a bad thing. This would be privacy theater, if you will.

Avoiding a slippery slope isn't theater.

Combine that with the fact that this would be significantly more churn and work - you either have the mapping be automatic, in which case it might completely re-number everything on every Go release, or you have to manually keep it up to date whenever something changes - and this seems an overall pretty bad idea.

In my opinion, there isn't much widespread sympathy for having to do more work to do the right, best thing in terms of privacy. This change in direction happened because the community insisted that the Go team do the harder thing. Russ acknowledged that this change will require more work:

For all these reasons, I've revised the design to be opt-in (default off). Doing so does introduce at least two costs: first, we must run an ongoing campaign to educate users about opting in is a good choice for them, and second, with fewer systems reporting, the telemetry cost imposed on any particular user is higher.

So I don't think it's necessarily relevant how hard it is, not without a careful tradeoff analysis. And I doubt it would entail as much hassle as you make it out to be, once a process is in place. For instance, the index can be immutable, and added to over time and releases with an AST scan. There may indeed be a good reason not to do this, but I'm not convinced that this is one of them.

@thepudds
Copy link
Contributor

thepudds commented Mar 9, 2023

Most of the data sent will be counters/numbers anyway. The format is binary, if I remember correctly, so it can be mmap'ed. This wouldn't affect the human readability of the data either way.

Hi @willfaught, there is a binary file used for performance reasons, but as I understand it, the uploaded data is purposefully in human-readable JSON. From the design post:

Recall from above that the local, binary counter files are stored in /go/telemetry/local/. When a report is uploaded, the exact JSON that was uploaded is written to /go/telemetry/uploaded/, named for the day of the upload (2006-01-02.json). The aim of both these directories (including their naming) is to make the system’s overall operation as transparent as possible.

If the Go toolchain was to be underhanded about what it is doing, information could in theory be smuggled in a series of words or a series of numbers.

To reduce the chance of the Go toolchain doing something underhanded here, including regarding how it handles strings, I suspect someone opting in will need to rely at least in part on the Go toolchain being open source (and either look at the code themselves, or rely on the likelihood of others doing so, the reproducibility of toolchain builds, the code review process, etc.).

If someone crosses that threshold of trust and is willing to choose to opt in, I tend to agree with @Merovius that plain text strings are easier to understand and help with transparency more than a series of numbers.

@AndrewHarrisSPU
Copy link

Most of the data sent will be counters/numbers anyway. The format is binary, if I remember correctly, so it can be mmap'ed. This wouldn't affect the human readability of the data either way.

From https://research.swtch.com/telemetry-design:

... the local, binary counter files are stored in /go/telemetry/local/. When a report is uploaded, the exact JSON that was uploaded is written to /go/telemetry/uploaded/, named for the day of the upload (2006-01-02.json).

I think this is saying the counter files are locally compact, but emitted as JSON. In the original thread someone made the point that plaintext is a good clue to someone who is sniffing traffic what's going on, that made sense to me. In a wonky way, a set of plaintext strings is already a set of unique numbers in this kind of scenario.

@willfaught
Copy link
Contributor

I see. It's hard to judge the readability of {"foo": 12} vs. {"46": 12} without knowing what foo might be, but if the string index is published, people can work out for themselves that {"46": 12} means {"error_when_build_stdlib": 12}. Most of the time, nobody is going to be looking at these, especially if they have numbered keys, so optimizing their readability doesn't seem that important. It would be trivial to write a little tool people could use to translate the string IDs in what was sent into the strings themselves, if they want to know.

I think the bigger concern—bigger than the malicious sending of personal data strings—is the accidental sending of personal data strings. Nobody wants "home_video_of_[embarrassing thing].mov" or "[embarrassing thing]website.com/go-foo/bar" accidentally sent to the Go team. And perhaps no one will file an issue to fix it either, out of embarrassment, when the problem is discovered. Not sending any strings at all rules out that possibility entirely.

@ianlancetaylor
Copy link
Contributor

By design only strings that are explicitly requested by the telemetry server will be sent to the telemetry server. Of course we can't stop anyone from modifying the Go tools to send other strings, but it hardly seems likely to happen by accident.

@doggedOwl
Copy link

@willfaught publishing numbers instead of strings is a false sense of security. It accomplishes nothing. Who does it protect from when the strings are public and can be consulted anytime by anyone? Users would be more comfortable reading and understating right there what is being sent instead of cross-referencing that is a burden for the user an no burden for any would-be malicious actors.
As it is, with the change to the Opt-in Mechanism, this is a very good balanced approach.

@Merovius
Copy link
Contributor

@willfaught

And I doubt it would entail as much hassle as you make it out to be, once a process is in place. For instance, the index can be immutable, and added to over time and releases with an AST scan.

Note that one class of strings are counter names and the counter names might be a full stack trace (function names and line number offsets). The space is large. So, no, I don't think I'm overstating the hassle.

And, again, I think this actually worsens the privacy and transparency of the design and makes it easier to abuse, if anything. I agree that doing extra work might be worth it, if it improved privacy - but in this case, it's a lose-lose, you'd have to do extra work to make things worse.

@willfaught
Copy link
Contributor

Note that one class of strings are counter names and the counter names might be a full stack trace (function names and line number offsets). The space is large. So, no, I don't think I'm overstating the hassle.

Ah, I see, then the string space might indeed be very large. Makes sense.

@ainar-g
Copy link
Contributor

ainar-g commented Mar 16, 2023

For those who might be interested, if users will be able to inspect the reports:

The counter files are stored in the directory <user>/go/telemetry/local/, where <user> is the user configuration directory, as reported by os.UserConfigDir. […]

(From https://research.swtch.com/telemetry-design#counting.)

By the way. Is there a reason why it's in ~/go/ as opposed to in whatever $GOPATH is set to? Nevermind, I misread that.

@seankhliao
Copy link
Member

By the way. Is there a reason why it's in ~/go/ as opposed to in whatever $GOPATH is set to?

It's not ~/go/, it's a directory under user config, the same place env config (go env -w) is stored.

@rsc
Copy link
Contributor Author

rsc commented Mar 29, 2023

Have all concerns about this proposal been addressed?

@rsc
Copy link
Contributor Author

rsc commented Apr 6, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor Author

rsc commented Apr 12, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: all: add opt-in transparent telemetry to Go toolchain all: add opt-in transparent telemetry to Go toolchain Apr 12, 2023
@rsc rsc modified the milestones: Proposal, Backlog Apr 12, 2023
@cuiweixie
Copy link
Contributor

cuiweixie commented May 26, 2023

Is there a env that may like GOCOLLECTIONSERVER to set the address of collection server? eg. GOCOLLECTIONSERVER=go.dev

@gopherbot
Copy link

Change https://go.dev/cl/559199 mentions this issue: cmd: vendor golang.org/x/telemetry and call counter.Open in cmd/go

gopherbot pushed a commit that referenced this issue Jan 30, 2024
This change vendors golang.org/x/telemetry and calls counter.Open in
cmd/go right at the beginning.

For #58894

Change-Id: Iae56328440614b213c1429972e6f68f22c2112cd
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-windows-amd64-longtest,gotip-linux-amd64-longtest-race
Reviewed-on: https://go-review.googlesource.com/c/go/+/559199
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Copy link

Change https://go.dev/cl/559519 mentions this issue: cmd/go: add telemetry counters for flag names and subcommand

@gopherbot
Copy link

Change https://go.dev/cl/559795 mentions this issue: cmd: update golang.org/x/telemetry to 1cb064e and vendor it

gopherbot pushed a commit that referenced this issue Feb 1, 2024
This brings in CL 559505 which adds a stub for counter.CountFlags so
it can be depended on and still build on Go 1.18 and earlier. This
will allow the go command to use counter.CountFlags and still be able
to build as the bootstrap command with an earlier version of Go.

For #58894

Change-Id: I31d5b96bd47eef2e407ef97e6146adece403f2c0
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/559795
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
gopherbot pushed a commit that referenced this issue Feb 1, 2024
For #58894

Change-Id: I6b5d5b14be9858f5855eeac0110aa44e762cee03
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/559519
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/562715 mentions this issue: cmd: update golang.org/x/telemetry and its dependencies

@gopherbot
Copy link

Change https://go.dev/cl/562735 mentions this issue: cmd/go: add telemetry counters for flag names and subcommand

gopherbot pushed a commit that referenced this issue Feb 8, 2024
This change was produced with
	go get golang.org/x/telemetry@latest
	go mod tidy
	go mod vendor

For #65586,#58894

Change-Id: I631a424ebb726fb0999d4b5d1e6e7a288b475344
Cq-Include-Trybots: luci.golang.try:gotip-windows-386,gotip-windows-amd64-longtest,gotip-solaris-amd64,gotip-openbsd-amd64,gotip-wasip1-wasm_wazero,gotip-js-wasm
Reviewed-on: https://go-review.googlesource.com/c/go/+/562715
TryBot-Bypass: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Michael Matloob <matloob@golang.org>
Commit-Queue: Michael Matloob <matloob@golang.org>
gopherbot pushed a commit that referenced this issue Feb 8, 2024
For #58894,#65586

This is a revert of CL 560655 which was a revert of CL 559519.
CL 559519 was reverted because it was broken on windows/386. But now
CL 562715 pulls in x/telemetry CL 560462 which disables telemetry on
windows/386, fixing that issue.

Change-Id: I094e90c28bca02f2303807d3b008f2ef9d59433c
Reviewed-on: https://go-review.googlesource.com/c/go/+/562735
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Copy link

Change https://go.dev/cl/564555 mentions this issue: cmd/go: add file with list of all counters we collect

ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
This change vendors golang.org/x/telemetry and calls counter.Open in
cmd/go right at the beginning.

For golang#58894

Change-Id: Iae56328440614b213c1429972e6f68f22c2112cd
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-windows-amd64-longtest,gotip-linux-amd64-longtest-race
Reviewed-on: https://go-review.googlesource.com/c/go/+/559199
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
This brings in CL 559505 which adds a stub for counter.CountFlags so
it can be depended on and still build on Go 1.18 and earlier. This
will allow the go command to use counter.CountFlags and still be able
to build as the bootstrap command with an earlier version of Go.

For golang#58894

Change-Id: I31d5b96bd47eef2e407ef97e6146adece403f2c0
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/559795
Reviewed-by: Bryan Mills <bcmills@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
For golang#58894

Change-Id: I6b5d5b14be9858f5855eeac0110aa44e762cee03
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/559519
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
This change was produced with
	go get golang.org/x/telemetry@latest
	go mod tidy
	go mod vendor

For golang#65586,golang#58894

Change-Id: I631a424ebb726fb0999d4b5d1e6e7a288b475344
Cq-Include-Trybots: luci.golang.try:gotip-windows-386,gotip-windows-amd64-longtest,gotip-solaris-amd64,gotip-openbsd-amd64,gotip-wasip1-wasm_wazero,gotip-js-wasm
Reviewed-on: https://go-review.googlesource.com/c/go/+/562715
TryBot-Bypass: Michael Matloob <matloob@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Michael Matloob <matloob@golang.org>
Commit-Queue: Michael Matloob <matloob@golang.org>
ezz-no pushed a commit to ezz-no/go-ezzno that referenced this issue Feb 18, 2024
For golang#58894,golang#65586

This is a revert of CL 560655 which was a revert of CL 559519.
CL 559519 was reverted because it was broken on windows/386. But now
CL 562715 pulls in x/telemetry CL 560462 which disables telemetry on
windows/386, fixing that issue.

Change-Id: I094e90c28bca02f2303807d3b008f2ef9d59433c
Reviewed-on: https://go-review.googlesource.com/c/go/+/562735
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Michael Matloob <matloob@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Copy link

Change https://go.dev/cl/568257 mentions this issue: cmd/go: call telemetry.Start at the beginning

gopherbot pushed a commit that referenced this issue Mar 4, 2024
Maintain a list of counters we collect and test that it hasn't
changed. If it has, fail a test and have the user update the list. The
update process will print a reminder to update the list of collected
counters.

Also run go mod vendor to pull in
golang.org/x/telemetry/counter/countertest.

For #58894

Change-Id: I661a9c3d67cb33f42a5519f4639af7aa05c3821d
Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/564555
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/570679 mentions this issue: cmd/compile: increment counters for passed in flag names

@gopherbot
Copy link

Change https://go.dev/cl/570736 mentions this issue: cmd/go: change some counter names

gopherbot pushed a commit that referenced this issue Mar 13, 2024
Primarily, this change removes the cmd/ prefix on the go command
counter names. The 'error' counter is changed to 'errors' reflecting
that it's a bucket that contains multiple errors. the switch-exec and
select-exec counters are moved into a 'toolchain' grouping.

For #58894

Change-Id: Id6e0e7a0b4a5e42a0aef04b1210d2bb5256eb6c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/570736
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests