Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: x/tools/go/analysis: add Pass.Module field #66315

Open
adonovan opened this issue Mar 14, 2024 · 17 comments
Open

proposal: x/tools/go/analysis: add Pass.Module field #66315

adonovan opened this issue Mar 14, 2024 · 17 comments
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-FinalCommentPeriod
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Mar 14, 2024

Proposal Details

Some analyzers need information about the module to which a package belongs, such as its version of Go. We have already exposed this information in x/tools/go/packages (and I observe that most features of packages.Package that are plausibly build system-agnostic eventually end up needing to be exposed through analysis.Pass too).

I propose that we define a Pass.Module field, something like this: [see bottom of note for most current proposal]

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	Replace   *Module      // replaced by this module
	Time      *time.Time   // (optional) time version was created
	Main      bool         // is this the main module?
	Indirect  bool         // is this module only an indirect dependency of main module?
	Dir       string       // directory holding files for this module, if any
	GoMod     string       // path to go.mod file used when loading this module, if any
	GoVersion string       // go version used in module
	Error     error        // error loading module
}

This set of fields exactly matches what is in packages.Module. Clearly this is the maximal set of fields, since a go/packages-based driver will not be able to set any additional (non-derived) fields. But perhaps a more minimal set might be a safer starting point; suggestions welcome.

Build systems other than go list, such as Bazel, Blaze, Pants, Buck, Please and so on, will need to simulate the module information if they do not use go.mod files. I am interested to hear from maintainers of Go integrations in those systems whether this set of fields looks manageable. See:

Update (Apr 4):

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}
@fmeum
Copy link

fmeum commented Mar 15, 2024

Thanks for the heads-up.

Bazel's rules_go has adopted go.mod as the source of truth for dependency declarations with Bazel's new dependency management system Bzlmod (https://github.com/bazelbuild/rules_go/blob/master/docs/go/core/bzlmod.md#specifying-external-dependencies).

We would thus be able to fill these fields (maybe except for Time, but it is optional).

@timothy-king
Copy link
Contributor

My suggestion for a minimal set of fields {Path, Version, GoVersion}.

@timothy-king
Copy link
Contributor

Should we make Pass.Module an optional feature only to be filled in if it is requested by a boolean field on the Analyzer? Such as RequiresModule. This gives an understandable default of what to do when Module.Error != nil. If RequiresModule is on and Module.Error != nil, skip the pass and warn the user of an error.

@adonovan
Copy link
Member Author

adonovan commented Mar 18, 2024

That wouldn't be a compatible change. If we add the field, then a driver that hasn't implemented support for it yet will run analyzers unconditionally, despite their RequiresModule declaration, so they will crash.

Drivers and analyzers must both be able to handle missing module information.

@timothy-king
Copy link
Contributor

Good point. What do we want to do when Module.Error != nil then? Some obvious options:

  1. Not run the Pass under the premise there is an error on the package?
  2. Not provide the Pass the Module value, i.e. act as if it is missing?
  3. Provide the Pass the value of Module.Error to decide for itself?
  4. Something else?

The first option seems like the easiest to relax in the future.

@adonovan
Copy link
Member Author

Option 3. If the pass is running, it means we got a package (possibly with errors if RunDespiteErrors). The analyzer can deal with them.

@dr2chase
Copy link
Contributor

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016. Perhaps that CL should be revised, @michaelmatloob suggested a design document or issue to link to, neither of which yet exists, which is why I have not yet submitted.

There's no guarantee (as I understand it) that packages from which a vet analysis imports facts, have themselves been analyzed to ensure that those facts are exported. I verified that the export-import worked if I pre-ran the analyses by hand, but if I did not do that, then there are no facts to import.

@adonovan
Copy link
Member Author

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016. Perhaps that CL should be revised, @michaelmatloob suggested a design document or issue to link to, neither of which yet exists, which is why I have not yet submitted.

This proposal discussion will determine whether we go for a "maximalist" approach (plumb through everything from go/packages.Module) or a minimalist one (just module path and version). Either way, your linked CL will need to specify at least one additional field.

There's no guarantee (as I understand it) that packages from which a vet analysis imports facts, have themselves been analyzed to ensure that those facts are exported. I verified that the export-import worked if I pre-ran the analyses by hand, but if I did not do that, then there are no facts to import.

There is such a guarantee, except for the standard library. (That unfortunate exception is because the std lib is precompiled in Blaze and Bazel, not built on demand.) In other words, if a driver makes a pass of analyzer A on package P, where A uses facts and P imports Q, then the driver must already have executed pass (A, Q) successfully, unless Q is in std.

@timothy-king
Copy link
Contributor

Maybe related, a recent very minimal CL (not yet submitted) to forward module path information from vet (which already contains module version). https://go.dev/cl/571016.

@dr2chase Definitely related. For that CL to help an Analyzer, this proposal or something similar will need to be approved to give that data a place to be plumbed to. You may want to wait on this being approved.

It would also help to know what the minimal data you need is. Also your 2 cents on maximal interface or minimal interface would be appreciated.

@peterebden
Copy link

Thanks for looping us in on this!

Please's Go plugin has most of the module information available to it; these days it's typically synced from the go.mod file using Puku. Hence I expect we shouldn't have much trouble producing this information.

The only fields I expect we would be unlikely to fill in are Time (which is optional anyway), Replace (although there are similar mechanisms, we don't really have a first-class concept of it) and possibly Indirect (we have that information at some level but I don't know how available it will be to this analysis). I'd anticipate that's unlikely to cause any major issues?

Overall the proposal LGTM.

@timothy-king
Copy link
Contributor

I'd anticipate that's unlikely to cause any major issues?

If the fields are not optional, Analyzers can assume they are always provided. If not present, the Analyzer could crash (nil-pointer dereference) or otherwise misbehave.

The only fields I expect we would be unlikely to fill in are Time (which is optional anyway), Replace (although there are similar mechanisms, we don't really have a first-class concept of it) and possibly Indirect (we have that information at some level but I don't know how available it will be to this analysis).

@peterebden Is it fair to say that this is a request that Module.{Time,Replace,Indirect} are optional or not present?

@dr2chase
Copy link
Contributor

dr2chase commented Apr 1, 2024

@timothy-king Sorry for the late reply, been out-of-hemisphere and hacking on other problems also.

The information that (I think) I need is just Path and Version, but I am a little worried about the fields that I don't know for sure that I understand (that might affect the effective version, for example). So if in fact I need those fields to correctly answer "are these two packages from the same module or not?" or "what Go version applies to compilation of this package?" then they should be included, otherwise, if nobody has a use for them yet, then maybe wait. Maybe.

I approve of it being a pointer-to-a-thing, so that "nope, no module info here" can be signaled with a nil pointer.

One thing to remember is that we'll want to fill this in, in the .cfg that vet passes to analysis passes. I have a pair of CLs for doing this with fragileconversion and I can verify that with that change it works, without it does not (unitchecker, vet).

I went to look at the full set of module information available to go-build-vet-etc, and saw

type ModulePublic struct {
	Path       string           `json:",omitempty"` // module path
	Version    string           `json:",omitempty"` // module version
	Query      string           `json:",omitempty"` // version query corresponding to this version
	Versions   []string         `json:",omitempty"` // available module versions
	Replace    *ModulePublic    `json:",omitempty"` // replaced by this module
	Time       *time.Time       `json:",omitempty"` // time version was created
	Update     *ModulePublic    `json:",omitempty"` // available update (with -u)
	Main       bool             `json:",omitempty"` // is this the main module?
	Indirect   bool             `json:",omitempty"` // module is only indirectly needed by main module
	Dir        string           `json:",omitempty"` // directory holding local copy of files, if any
	GoMod      string           `json:",omitempty"` // path to go.mod file describing module, if any
	GoVersion  string           `json:",omitempty"` // go version used in module
	Retracted  []string         `json:",omitempty"` // retraction information, if any (with -retracted or -u)
	Deprecated string           `json:",omitempty"` // deprecation message, if any (with -u)
	Error      *ModuleError     `json:",omitempty"` // error loading module
	Sum        string           `json:",omitempty"` // checksum for path, version (as in go.sum)
	GoModSum   string           `json:",omitempty"` // checksum for go.mod (as in go.sum)
	Origin     *codehost.Origin `json:",omitempty"` // provenance of module
	Reuse      bool             `json:",omitempty"` // reuse of old module info is safe
}

and was curious about whether any of this should be included, if it happens to be available in an analysis framework.
Deprecated seemed relevant, perhaps.

@adonovan
Copy link
Member Author

adonovan commented Apr 4, 2024

Let's stick with the subset of fields Tim proposed: {Path, Version, GoVersion}; the rest are more obscure. We can always add more later when a clear need arises.

I've appended the updated API proposal to the bottom of the first comment.

Any objections?

@rsc
Copy link
Contributor

rsc commented Apr 4, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Apr 10, 2024

Have all remaining concerns about this proposal been addressed?

The proposal is to add

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}

@rsc
Copy link
Contributor

rsc commented Apr 24, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

The proposal is to add

package analysis

type Pass struct {
    ...
    Module *Module // optional
}

// Module provides information about the module (if any) to which a package belongs.
type Module struct {
 	Path      string       // module path
	Version   string       // module version
	GoVersion string       // go version used in module
}

@timothy-king
Copy link
Contributor

I have a candidate implementation here https://go.dev/cl/577996.

One thing I observed was that Pass.Module.GoVersion is empty on locally checked out packages, and populated for dependencies on other modules. I believe this is WAI and what one would expect from go list. It may be confusing to have different results on the "same" package though.

IMO what we should do is document that "" means unknown and that this includes local packages so Analyzer writers are aware of this. Alternatively this could be a reason to have Pass.Module == nil to signal 'unknown'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-FinalCommentPeriod
Projects
Status: Likely Accept
Development

No branches or pull requests

7 participants