Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/go/ssa/ssautil: Reachable: conservative approximation to reachable functions, runtime types #69291

Open
adonovan opened this issue Sep 5, 2024 · 13 comments
Assignees
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Sep 5, 2024

Background: The existing ssautil.AllFunctions helper performs a reachability analysis starting from all the packages in a Go SSA program, and returns the set of functions it encounters:

package ssautil

func AllFunctions(prog *ssa.Program) map[*ssa.Function]bool

Unfortunately it is poorly defined and its implementation is a bit of a historical mess; it is complicated and delivers results that are simultaneously "too much" (imprecise) and "too little" (unsound).

An example of imprecision is that it searches the body of every function in the ssa.Program for MakeInterface operations, when only those functions that are reachable from the entry points need be considered. Another example is that it uses all packages of the program as roots for reachability, even though typically the program consists of all dependencies of a few packages of interest.

An example of unsoundness is that it doesn't consider all of the types derivable from the public API of a package as potential dynamic call targets.

Furthermore, its traversal is a lost opportunity to to track information ("address-taken" status of functions) that would be useful to CHA and VTA.

Proposal: We propose to add a new function ssautil.Reachable that:

  • is derived from first principles
  • is analytically sound (conservative)
  • returns additional information useful to CHA, VTA, and vulncheck, that will improve the precision of their results while reducing their running time.

We plan to redefine the existing AllFunctions as a shallow wrapper around the new function.

package ssautil

// Reachable returns a conservative approximation to the set of
// functions and runtime types that are potentially reachable from the
// specified packages (which need not form a whole program).
//
// Any source function in package P that is not among the functions
// (map keys) returned by Reachable(P) is unreachable ("dead
// code"--though for dead-code analysis of whole programs, we
// recommend the more precise RTA).
// Beware: the values of the functions map are not all "true": the
// result is a mapping from functions to their "address-taken" status.
//
// The runtime types returned by Reachable are those that are
// potentially converted to interfaces, making them potential targets
// of interface method calls.
//
// The results are computed using the following algorithm.
//
// First, we tabulate a set of initial functions and runtime types
// from the specified packages.
//
// For packages named "main", the initial functions are the entry
// points: the main function and the synthetic package initializer.
// There are no initial runtime types.
//
// For importable packages, the initial functions include the package
// initializer and all exported non-parameterized functions; these
// functions are assumed to be potentially callable from external
// packages not visible to the analysis. The initial runtime types are
// the types of every exported non-parameterized declaration; these
// types are accessible to reflection from external packages.
//
// Second, we compute a fixed point of the following induction rules:
//
//  1. Each function's instructions are analyzed. Any function
//     referenced by an instruction is added to the set. Any
//     non-interface type used as the operand of a MakeInterface
//     instruction is added to the set of runtime types.
//
//  2. Each type is analyzed. Every element type found by recursively
//     removing type constructors is added to the set of runtime types.
//
//  3. For each type in the set, its methods are added to the set of functions.
//
// The initial exported functions are all assumed to be
// "address-taken", making them candidate targets of dynamic function
// calls. All methods found by rule 3 are similarly considered
// address-taken. Functions found by rule 1 are considered
// address-taken unless they are used in the call position of an
// [ssa.CallInstruction].
//
// Finally, it returns the final sets of functions and runtime types.
//
// For best results, provide packages from a Program constructed with the
// [ssa.InstantiateGenerics] mode flag.
func Reachable(pkgs []*ssa.Package) (functions map[*ssa.Function]bool, runtimeTypes *typeutil.Map)

@timothy-king @zpavlinovic

@gabyhelp
Copy link

gabyhelp commented Sep 5, 2024

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@adonovan adonovan self-assigned this Sep 5, 2024
@adonovan adonovan moved this to Incoming in Proposals Sep 5, 2024
@adonovan adonovan added the Analysis Issues related to static analysis (vet, x/tools/go/analysis) label Sep 5, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/609281 mentions this issue: go/callgraph/cha: make CHA, VTA faster and more precise

@zpavlinovic
Copy link
Contributor

...
// First, we tabulate a set of initial functions and runtime types
// from each package in the program.

I guess you mean pkgs here?

...
// For best results, provide a Program constructed with the
// [ssa.InstantiateGenerics] mode flag.
func Reachable(pkgs []*ssa.Package) (functions map[*ssa.Function]bool, runtimeTypes *typeutil.Map)

Ditto?

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/610939 mentions this issue: go/ssa: extract type recursion into a helper package

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/611535 mentions this issue: internal/callgraph/chautil: add Reachable function

gopherbot pushed a commit to golang/tools that referenced this issue Sep 9, 2024

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This change moves ssa.forEachReachable into
internal/typesinternal.ForEachElement, simplifies
the signature (by internalizing the type map part)
and adds a test.

There are two copies of this algorithm (the other in
go/callgraph/rta), and we will need it again in
ssautil.Reachable (see golang/go#69291).
A follow-up change will make the copy in rta delegate
to this one (small steps).

Also, make ssa.Program.RuntimeTypes do the type analysis
when it is called, instead of doing it eagerly each time
we encounter a MakeInterface instruction.
This should reduce eventually costs since RuntimeTypes
shouldn't be needed: it was added for the pointer analysis
(since deleted) and it used by ssautil.AllFunctions (to
be replaced, see golang/go#69291), and in both cases it
is the wrong tool for the job because:

(a) it is more precise to accumulate runtime types in
    the subset of the program of interest, while doing
    some kind of reachability fixed-point computation;
    and

(b) its use in AllFunctions is unsound because although
    it accounts for all (too many!) MakeInterface operations
    it does not account for types exposed through public
    API (see the proposed replacement, ssautil.Reachable)
    when analyzing incomplete programs.

Updates golang/go#69291

Change-Id: Ib369278e50295b9287fe95c06169b81425193e90
Reviewed-on: https://go-review.googlesource.com/c/tools/+/610939
Reviewed-by: Tim King <taking@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@rsc rsc moved this from Incoming to Active in Proposals Sep 11, 2024
@rsc
Copy link
Contributor

rsc commented Sep 11, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@aclements
Copy link
Member

This seems reasonable. My main concern is that x/tools has traditionally had a slow accumulation of analysis cruft and I don't want this to also become more future cruft. The thing that seems most over-fit to the current needs is the way address-taken information is returned. I see the need for that, but worry that the map will be easy to use incorrectly.

@adonovan
Copy link
Member Author

x/tools has traditionally had a slow accumulation of analysis cruft and I don't want this to also become more future cruft.

Yeah, we know. ;-) If it's any consolation, some of this cruft predated the Go command's support for internal directories, and we have been steadily tidying things:

@aclements
Copy link
Member

@adonovan , for the record, could you explain what would be updated to use this and what the effects would be?

@adonovan
Copy link
Member Author

adonovan commented Oct 2, 2024

could you explain what would be updated to use this and what the effects would be?

In the CL stack of https://go.dev/cl/609281 linked above, the CL https://go.dev/cl/609281 uses the new Reachable function to improve the performance and precision of CHA by producing a smaller set of candidate dynamic call targets (address-taken functions only), and thus a smaller and more precise call graph in less time.

Since VTA uses CHA as a first step, this also improves the performance of VTA, though not its precision, since it eventually gets the same answer at the cost of more work (deleting spurious edges).

Govulncheck uses VTA (indeed, it is the most expensive step), so this change makes the govulncheck tool much faster. Govulncheck lives in a different repo (x/vuln), which is why Reachable cannot be a mere internal function of x/tools (as it is in my current CL stack). And of course other external clients of VTA would also benefit.

--

The proposed Reachable function is not just more precise, it is "more sound" (excuse the oxymoron), because it does the fixed point iteration of functions and dynamic types together. By contrast, the old implementation used for its set of dynamic types only the types encountered during SSA building, which doesn't include types from the public API of each importable package (when analyzing partial programs) nor types and functions created by generic instantiation during the fixed point iteration. I expect the "increased soundness" to be minor though.

@aclements
Copy link
Member

Have all remaining concerns about this proposal been addressed?

The proposal is in #69291 (comment)

@aclements aclements moved this from Active to Likely Accept in Proposals Nov 6, 2024
@aclements
Copy link
Member

Based on the discussion above, this proposal seems like a likely accept.

The proposal is in #69291 (comment)

@aclements aclements moved this from Likely Accept to Accepted in Proposals Nov 13, 2024
@aclements
Copy link
Member

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.

The proposal is in #69291 (comment)

@aclements aclements changed the title proposal: x/tools/go/ssa/ssautil: Reachable: conservative approximation to reachable functions, runtime types x/tools/go/ssa/ssautil: Reachable: conservative approximation to reachable functions, runtime types Nov 13, 2024
@aclements aclements modified the milestones: Proposal, Backlog Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) Proposal Proposal-Accepted
Projects
Status: Accepted
Development

No branches or pull requests

7 participants