testing: annotate output text type #62728

mknyszek · 2023-09-19T15:31:13Z

Currently the go test -json provides no way to disambiguate output produced by (*testing.T).Error{,f} and (*testing.T).Fatal{,f} from other test logging. This fact makes it difficult for CI systems to identify what part of the output is most pertinent for constructing summaries of test failures.

Consider a test that failed, producing a lot of log messages as a result. Reporting the full log of the test can often be distracting and overly specific, resulting in many CI systems (such as LUCI) not constructing test failure summaries for Go tests at all.

Heuristics such as taking the last line logged before a test fails work often, but not often enough to fully solve the problem. Enough tests call (*testing.T).Error{,f} and fail later (after more log output) that this heuristic is unreliable. This works even less well when considering tests that crash or panic: that last line is often going to be some random line in the stack trace.

I propose adding a boolean field to "output" actions in the JSON produced by cmd/test2json called "Error". This field would be "true" for any "output" actions that were produced via (*testing.T).Error{,f} and (*testing.T).Fatal{,f}. This field would be also be "true" for the first few lines of a crash (but notably not the stack trace itself).

For example:

{"Time":"2023-09-19T15:19:12.447816654Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"--- FAIL: TestHeap (0.00s)\n","Error":false}
{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"panic: ahhh [recovered]\n","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\tpanic: ahhh\n","Error":true}
{"Time":"2023-09-19T15:19:12.449928103Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\n","Error":false}
{"Time":"2023-09-19T15:19:12.449932593Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"goroutine 6 [running]:\n","Error":false}
{"Time":"2023-09-19T15:19:12.449936283Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"testing.tRunner.func1.2({0x51ad20, 0x577910})\n","Error":false}
{"Time":"2023-09-19T15:19:12.449940423Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\t/usr/local/google/home/mknyszek/work/go-1/src/testing/testing.go:1545 +0x24a\n","Error":false}
{"Time":"2023-09-19T15:19:12.449953903Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"testing.tRunner.func1()\n","Error":false}

Which lines of a crash get marked would be entirely up to cmd/test2json. Because it lives in the standard library, it can reasonably be coupled (with tests) to the crash output that the Go runtime produces.

Testing binaries' -test.v=test2json output would need to be adjusted such thatcmd/test2json could interpret errors correctly. For this, I propose the use of a new marker character (I am not picky as to which) at the beginning of a frame (delineated currently by ^V (0x16)).

The text was updated successfully, but these errors were encountered:

bcmills · 2023-09-19T17:13:30Z

cmd/test2json consumes the stdout from a test process as its input. Presumably the testing package would need to add some kind of new framing to communicate whether the text that follows indicates an error?

mknyszek · 2023-09-19T17:45:00Z

Yup! I don't know the history of how ^V (0x16) was chosen as the current framing marker, but we would need another one probably. I'm not really sure how to pick another one, though. I assume it should probably be invisible again? I'm open to suggestions.

patrickmeiring · 2023-09-20T02:56:21Z

The proposal looks good to me from a LUCI perspective.

Implementation-wise, with the frame markers - is there any concern about race conditions when tests are run in parallel? Or maybe this is the best that is available with the current implementation.

Beyond this proposal - if I can offer a CI perspective on what sort of structured data may be useful for a language like golang to generate -:

Structured reasons for build failures. I note there is an existing proposal in this regard: cmd/go: add build -json, to report build failures in JSON #23037
Stack traces for test failures (avoiding the need to scrape logs, e.g. if the user provides an error directly to t.Error() or if the test panics). We do not use these in LUCI Analysis yet, but there have been requests along this line. Having the stack traces seperated out can be useful for presentation reasons, to cluster them for test bug triage, and/or to match stack traces in seen test failures with stack traces seen out in the wild from crash telemetry.

One of the best test frameworks I have come across so far in terms of structured result output is Tast [1], the integration-testing framework for ChromeOS. It captures the time, reason, file and stack trace for all errors that cause a test to fail [2].

[1] https://chromium.googlesource.com/chromiumos/platform/tast/
[2] https://source.chromium.org/chromium/infra/infra/+/main:go/src/infra/tools/result_adapter/tast.go;l=63

patrickmeiring · 2023-09-26T23:21:16Z

A few further thoughts:

Where long output is produced, is it guaranteed that all output to one .Fatal() or .Error() call will be output on one test2json output line? I was under the impression that test2json pulls from stdout which is buffered. If one error may be split over two or more test2json output lines, could this affect the ability to recover the original errors? (observing one test may produce multiple errors)
I note the proposal does not specify if the error is fatal or not (i.e. comes from .Fatal(...) or .Error(...)). For failure reasons in LUCI Analysis, we typically capture the first fatal error, or the first error otherwise: https://source.chromium.org/chromium/infra/infra/+/main:go/src/go.chromium.org/luci/resultdb/proto/v1/failure_reason.proto;l=30?q=resultdb%20failure%20reason
- I think this is because fatal errors are generally seen as more interesting (panics, etc.)
- But you could always go with the first error if you prefer. The main thing would be to be consistent with the uploaded failure reason.

rsc · 2023-10-03T00:51:23Z

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

rsc · 2023-10-04T17:53:08Z

This seems reasonable, and following the existing framing (finding another little-used control character) seems fine too.

rsc · 2023-10-11T17:44:59Z

Have all remaining concerns about this proposal been addressed?

rsc · 2023-10-24T17:31:38Z

It looks like ^N and ^O are available and make a reasonable amount of sense, as these things go.
^N means start error text ("shift in") and ^O means back to regular text ("shift out").

rsc · 2023-10-24T18:07:38Z

@patrickmeiring For LUCI, I think first error makes the most sense and is the most consistent. Fatal is defined as Error+FailNow, and some programs or library code do call FailNow themselves. Code should not distinguish Error-generated messages from Fatal-calling-Error-generated messages.

rsc · 2023-10-24T18:11:57Z

@mknyszek I'm not sure about the part where the first few lines of a crash are tagged "Error":true. It seems clearer conceptually and certainly easier implementation-wise to only tag messages from t.Fatal/t.Error with "Error":true. To tag runtime crashes you'd have to make the runtime emit the ^N and ^O or else have some easily-fooled pattern matching in test2json.

#61839 is (now) about reporting a non-zero exit status in -json output. Perhaps that would be enough for crashes? You'd see the exit status and click on the full output and see the crash text.

patrickmeiring · 2023-10-24T22:06:10Z

@rsc

Re: Selecting first error - that sounds fine to me.
Re: test crashes - extracting the reasons for these would be useful as this is one of the main ways tests fail. Is there no standard panic recovery code path used in go test that we can hook into?
Re: Have all remaining concerns about this proposal been addressed?
I am seeking clarification on the behaviour if the program emits multiple errors, e.g.

t.Error(strings.Repeat("a", 1_000_000))
t.Error(strings.Repeat("b", 1_000_000))

Will it be possible to identify there were two errors, and to distinguish error a (with possibly long output) from error b?
See comment about buffering above. @mknyszek

rsc · 2023-10-25T17:33:00Z

Will it be possible to identify there were two errors, and to distinguish error a (with possibly long output) from error b?

Yes, there will be two ^N...^O sequences in the output.

patrickmeiring · 2023-10-26T01:02:08Z

Yes, there will be two ^N...^O sequences in the output.

How will these sequences be distinguished in cmd/test2json output?

It looks to me like cmd/test2json is stripping out those control characters and slicing the sequences into an arbitrary number of JSON messages.

E.g. the following example output from comment 1:

{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"panic: ahhh [recovered]\n","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"\tpanic: ahhh\n","Error":true}

Looks like it is all the same error based on the "Output" field.

But this is structurally the same as the output I would expect from two errors (aaaaaaa...) and (bbbbb.....):

{"Time":"2023-09-19T15:19:12.449904543Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"aaaaaaaaaaa","Error":true}
{"Time":"2023-09-19T15:19:12.449923813Z","Action":"output","Package":"golang.org/x/exp/trace","Test":"TestHeap","Output":"bbbbbbbbbb","Error":true}

aclements · 2023-10-28T15:19:00Z

To tag runtime crashes you'd have to make the runtime emit the ^N and ^O or else have some easily-fooled pattern matching in test2json.

Re: test crashes - extracting the reasons for these would be useful as this is one of the main ways tests fail. Is there no standard panic recovery code path used in go test that we can hook into?

I think we could easily tag panics that are recovered and re-panicked by the testing package itself. That doesn't capture runtime fatal panics or panics from other goroutines, but I imagine that would get the vast majority of test panics and should be easy to implement in the testing package.

rsc · 2023-11-02T14:14:48Z

@patrickmeiring I see what you mean. We need two different ways to flag an output line: "start of error" and "continuation of error". Perhaps we should use

"OutputType": "error"
"OutputType": "error-continue"

and that leaves the door open for other kinds of OutputType as well, like we could have "OutputType": "frame" for all the framing text ("=== RUN" and so on) so test systems can decide to gray it out.

?

rsc · 2023-11-02T17:16:34Z

Have all remaining concerns about this proposal been addressed?

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

patrickmeiring · 2023-11-08T02:24:41Z

Have all remaining concerns about this proposal been addressed?

The only open question I can see is whether to capture panics on other (non-test) goroutines and report them as errors. My take here is:

If a panic on another goroutine crashes the go test process instead of just failing a test, I think it is fine to not frame them with ^N and ^O (although it may still be useful to do so).
On the other hand, if a panic on an other goroutine is attributed to a test and ends up failing that test, then it would be better to frame the error so it can be reported as the test failure reason.

Either way, the resolution of this issue need not block the proposal as even a partial ability to extract failure reasons will be a significant improvement over the status quo.

rsc · 2023-11-10T17:28:21Z

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

rsc · 2023-11-16T10:11:37Z

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

In go test -json output, the "Action":"output" lines would be annotated with an optional new field "OutputType" giving the kind of output included in the entry. The initial kinds would be:

“error”, meaning text produced by t.Error/t.Fatal
“error-continue”, meaning a continuation line from the previous “error”
“frame”, meaning framing text like “=== RUN TestCase” that is already expressed more directly in other events

The specific implementation of “error” and “error-continue” would be that in test2json mode the test would print error text wrapped in ^N...^O, and test2json would recognize and remove these, the same as it does today for ^V around framing.

gopherbot added this to the Proposal milestone Sep 19, 2023

gopherbot added the Proposal label Sep 19, 2023

rsc changed the title ~~proposal: testing,cmd/test2json: distinguish error output~~ proposal: testing: annotate output text type Nov 2, 2023

dnephin mentioned this issue Nov 4, 2023

junitxml: include full output in system-out field gotestyourself/gotestsum#380

Open

rsc added the Proposal-FinalCommentPeriod label Nov 10, 2023

rsc changed the title ~~proposal: testing: annotate output text type~~ testing: annotate output text type Nov 16, 2023

rsc modified the milestones: Proposal, Backlog Nov 16, 2023

rsc added Proposal-Accepted and removed Proposal-FinalCommentPeriod labels Nov 16, 2023

bcmills mentioned this issue Dec 13, 2023

proposal: cmd/test2json: Allow Go Tests to Pass Metadata #43936

Open

dnephin mentioned this issue Dec 31, 2023

proposal: testing: Add T.Output() etc #59928

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

testing: annotate output text type #62728

testing: annotate output text type #62728

mknyszek commented Sep 19, 2023 •

edited

bcmills commented Sep 19, 2023 •

edited

mknyszek commented Sep 19, 2023

patrickmeiring commented Sep 20, 2023

patrickmeiring commented Sep 26, 2023 •

edited

rsc commented Oct 3, 2023

rsc commented Oct 4, 2023 •

edited

rsc commented Oct 11, 2023

rsc commented Oct 24, 2023

rsc commented Oct 24, 2023

rsc commented Oct 24, 2023

patrickmeiring commented Oct 24, 2023

rsc commented Oct 25, 2023

patrickmeiring commented Oct 26, 2023 •

edited

aclements commented Oct 28, 2023

rsc commented Nov 2, 2023 •

edited

rsc commented Nov 2, 2023

patrickmeiring commented Nov 8, 2023

rsc commented Nov 10, 2023

rsc commented Nov 16, 2023

testing: annotate output text type #62728

testing: annotate output text type #62728

Comments

mknyszek commented Sep 19, 2023 • edited

bcmills commented Sep 19, 2023 • edited

mknyszek commented Sep 19, 2023

patrickmeiring commented Sep 20, 2023

patrickmeiring commented Sep 26, 2023 • edited

rsc commented Oct 3, 2023

rsc commented Oct 4, 2023 • edited

rsc commented Oct 11, 2023

rsc commented Oct 24, 2023

rsc commented Oct 24, 2023

rsc commented Oct 24, 2023

patrickmeiring commented Oct 24, 2023

rsc commented Oct 25, 2023

patrickmeiring commented Oct 26, 2023 • edited

aclements commented Oct 28, 2023

rsc commented Nov 2, 2023 • edited

rsc commented Nov 2, 2023

patrickmeiring commented Nov 8, 2023

rsc commented Nov 10, 2023

rsc commented Nov 16, 2023

mknyszek commented Sep 19, 2023 •

edited

bcmills commented Sep 19, 2023 •

edited

patrickmeiring commented Sep 26, 2023 •

edited

rsc commented Oct 4, 2023 •

edited

patrickmeiring commented Oct 26, 2023 •

edited

rsc commented Nov 2, 2023 •

edited