Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices: add iterator-related functions #61899

Open
rsc opened this issue Aug 9, 2023 · 40 comments
Open

slices: add iterator-related functions #61899

rsc opened this issue Aug 9, 2023 · 40 comments

Comments

@rsc
Copy link
Contributor

rsc commented Aug 9, 2023

We propose to add the following functions to package slices, to provide good support for code using iterators.
This is one of a collection of proposals updating the standard library for the new 'range over function' feature (#61405). It would only be accepted if that proposal is accepted. See #61897 for a list of related proposals.

All serves as a “source” for iterators.

// All returns an iterator over index-value pairs in the slice.
// The indexes range in the usual order, from 0 through len(s)-1.
func All[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] {
	return func(yield func(int, Elem)bool) bool {
		for i, v := range s {
			if !yield(i, v) {
				return false
			}
		}
		return true
	}
}

Backward can be used to replace existing 3-clause loops that iterate backward over a slice manually. This happens fairly often in certain algorithms (for example it happens a lot in the compiler’s SSA code).

// Backward returns an iterator over index-value pairs in the slice,
// traversing it backward. The indexes range from len(s)-1 down to 0.
func Backward[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] {
	return func(yield func(int, Elem)bool) bool {
		for i := len(v)-1; i >= 0; i-- {
			if !yield(i, v[i]) {
				return false
			}
		}
	}
}

Values is like All: not terribly useful by itself but useful as a source for other code.

// Values returns an iterator over the values in the slice,
// starting with s[0].
func Values[Slice ~[]Elem, Elem any](s Slice) iter.Seq[Elem] {
	return func(yield func(Elem)bool) bool {
		for _, v := range s {
			if !yield(v) {
				return false
			}
		}
		return true
	}
}

Append and Collect serve as “sinks” for iterators.

// Append appends the values from seq to the slice and returns the extended slice.
func Append[Slice ~[]Elem, Elem any](x Slice, seq iter.Seq[Elem]) Slice {
	for _, v := range seq {
		x = append(x, v)
	}
	return x
}
// Collect collects values from seq into a new slice and returns it.
func Collect[Elem any](seq iter.Seq[Elem]) []Elem {
	return Append([]Elem(nil), seq)
}

Sorted and SortedFunc collect an iterator and then sort the contents:

// Sorted collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem {
	slice := Collect(seq)
	Sort(slice)
	return slice
}
// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func SortedFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem {
	slice := Collect(seq)
	SortFunc(slice, cmp)
	return slice
}
@rsc rsc added the Proposal label Aug 9, 2023
@gopherbot gopherbot added this to the Proposal milestone Aug 9, 2023
@seankhliao
Copy link
Member

Concat takes a a variadic second arg #56353
Should Append and Collect (and Values?) do the same?

@mohamedattahri
Copy link
Contributor

mohamedattahri commented Aug 9, 2023

@rsc – Any reason why these some of these shouldn't live in an iterators package of their own? It seems inconsistent to have functions in slices where the first arg is not a slice.

@earthboundkid
Copy link
Contributor

@rsc – Any reason why these some of these shouldn't live in an iterators package of their own? It seems inconsistent to have functions in slices where the first arg is not a slice.

I think this makes sense here. The iters package would define the iter.Iter type and things like Map, Filter, TakeWhile, and Zip that don't deal with particular iterator sources. You import slices when you want to convert an iter.Iter to/from a slice, and e.g. maps.Keys when you want to iterate over a map.

@DeedleFake
Copy link

DeedleFake commented Aug 9, 2023

Values() seems like it would be redundant, other than as a convenience, I suppose, if a hypothetical iter package provided something like

// Better name pending.
func SecondRet[T1, T2 any](iter Seq2[T1, T2]) Seq[T2] {
  return func(yield func(T2) bool) {
    for _, v := range iter {
      if !yield(v) {
        return false
      }
    }
  }
}

Either that or All() yielding the index is redundant if an Enumerate() implementation is added, too.

Also, two more functions that I'd like to suggest:

// SortInto places the elements of iter into the already sorted slice dst, expanding as necessary, and returns the resulting slice, also sorted.
func SortInto[S ~[]E, E cmp.Ordered](dst S, seq Seq[E]) S
func SortIntoFunc[S ~[]E, E any](dst S, seq Seq, cmp func(E, E) int)

And Sorted() should have a cmp.Ordered constraint, not comparable.

@rsc
Copy link
Contributor Author

rsc commented Aug 9, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@hherman1
Copy link

I’m a little nervous looking at this that someone is going to immediately release a fluent API for these which will be preferred by the community, and STL will be left with a less used package. E.g

stream.New(iter).
     Filter(…).
     Map(…).

and so on. Maybe it’s ok if that ends up happening, but seems a bit sad.

@zephyrtronium
Copy link
Contributor

@hherman1 If it's any comfort, Map in particular doesn't work as a method because of #49085, so it seems unlikely that such a package will grow especially popular.

@hherman1
Copy link

That’s both comforting and distressing :)
The ergonomics of composing iterators seem a bit rough without a fluent api, but maybe I’m thinking too much about Java streams.

@willfaught
Copy link
Contributor

willfaught commented Aug 10, 2023

Please add syntax highlighting to your code blocks. With it, you might have noticed that Sorted should be SortedFunc here:

// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem {

It seems like the functions that return iter types should be in the iter package, so they can be grouped under the Seq/Seq2 types in the generated documentation. These are technically "constructors" for those types in that sense. The functions that take iter types also seem like they belong in iter, as utility functions that work on iter types. I'm reminded of the io package, which declares Reader and various functions that produce and consume Readers.

Slices are built into the language, so it doesn't seem like a function producing or consuming a slice alone warrants it being in the slices package. The new go/token.File.Lines method returns a slice, yet I doubt it was ever considered whether it belonged in the slices package (being a method aside).

@hherman1
Copy link

I dont mind the colorless code blocks. The tone there also seems a bit rude @willfaught

@willfaught
Copy link
Contributor

The tone there also seems a bit rude @willfaught

My apologies.

@leaxoy
Copy link

leaxoy commented Aug 15, 2023

In most language, the All function should be Enumerate, why provide a counter-intuitive function in go? And should provide a function turn slice to Seq, for example:

func Iter[E any](slice []E) iter.Seq[E] {
    ...
}

// and for enumerate

func Enumerate[E any](slice []E) iter.Seq2[int, E] {
    // or if we had tuple type, return should be iter.Seq[(int, E)]
    // Again, I personally think that Seq2 is a very unwise design
    ...
}

@DeedleFake
Copy link

DeedleFake commented Aug 15, 2023

@leaxoy

I think you're forgetting the context. The function isn't All(); it's slices.All(). Enumerate() is the usual name for a function that takes an existing iterator and adds indices. For a function that takes a slice and iterated over it, I think All(), while not necessarily the best name, is more clear than Enumerate() would be.

@ChrisHines
Copy link
Contributor

I had not paid close attention to the names of the functions this proposal wants to add so far. In #61626 (comment), @rsc gives an example use as follows.

use(slices.Sorted(maps.Keys(m))

Somehow knowing that maps.Keys would return an iterator, but not having fully digested this proposal I was a bit disoriented. I wasn't expecting slices.Sorted to accept an iterator. The name just doesn't seem to convey that at first. The two concepts the name conveys are "slices" and "sorting".

I think that my initial confusion when reading the call site without having invested in already knowing the details of the slices.Sorted signature may be a sign that the name isn't quite right. How carefully chosen is that name? Can we do better?

@Merovius
Copy link
Contributor

@ChrisHines I think it's fairly common for a function to be named after what it returns and what it does, but less so after what it takes. slices.Sort sorts a slice (it's named after what it does) and slices.Sorted returns a sorted slice (it's named after what it returns).

There is also precedence: In python, sorted is a function taking an iterable and it returns a sorted list. And sort is a method on lists (and other containers) that sorts the list in-place.

To me, this seems fine.

@earthboundkid
Copy link
Contributor

Besides Python, JavaScript has Array.toSorted which returns a new Array that was sorted. So, two languages at least where "sort" means in place and "sorted" means "new slice/list/array".

@earthboundkid
Copy link
Contributor

Does Sorted need a stable variant? It does seem unfortunate to have so many sort variants:

  1. sort.Sort
  2. slices.Sort
  3. slices.SortFunc
  4. slices.SortStableFunc
  5. slices.Sorted
  6. slices.SortedFunc
  7. slices.SortedStableFunc

cc @eliben

@eliben
Copy link
Member

eliben commented Aug 16, 2023

Python's sorted is guaranteed to be stable, so they only have one sorted (not sorted and sortedstable) but that one sorted is stable

@willfaught
Copy link
Contributor

func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem {
	slice := Collect(seq)
	Sort(slice)
	return slice
}

I'm not a sorting expert, but I'm curious if using an online sorting algorithm here could be an improvement. There could be delays between iterations.

@earthboundkid
Copy link
Contributor

The heap package is in need of a total overhaul. It could incorporate iterators if that happened.

@earthboundkid
Copy link
Contributor

Maybe slices.SortedFunc should be stable by default. Anyone who wants a faster sort can write it manually, and stability bugs are more likely to matter in production than minor performance hits.

@rsc
Copy link
Contributor Author

rsc commented Aug 30, 2023

Finishing this proposal discussion is blocked on #61405.

@gopherbot
Copy link

Change https://go.dev/cl/558737 mentions this issue: slices: add All, Backward, Values, Append, Collect, Sorted, SortedFunc

@rsc
Copy link
Contributor Author

rsc commented Jan 31, 2024

Given slices.SortFunc exists and is unstable, it would be inconsistent for slices.SortedFunc to be stable. It would be especially annoying when moving between the two that the semantics change. If you replace a use of a stable SortedFunc with an unstable SortFunc, you'd be sad.

Given that slices.SortStableFunc exists, we should probably also add slices.SortedStableFunc to complete the set, so people don't have to remember which exist and which don't.

@rsc
Copy link
Contributor Author

rsc commented Feb 8, 2024

Have all remaining concerns about this proposal been addressed?

// All returns an iterator over index-value pairs in the slice.
// The indexes range in the usual order, from 0 through len(s)-1.
func All[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem]

// Backward returns an iterator over index-value pairs in the slice,
// traversing it backward. The indexes range from len(s)-1 down to 0.
func Backward[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] 

// Values returns an iterator over the values in the slice,
// starting with s[0].
func Values[Slice ~[]Elem, Elem any](s Slice) iter.Seq[Elem]

// Append appends the values from seq to the slice and returns the extended slice.
func Append[Slice ~[]Elem, Elem any](x Slice, seq iter.Seq[Elem]) Slice 

// Collect collects values from seq into a new slice and returns it.
func Collect[Elem any](seq iter.Seq[Elem]) []Elem

// Sorted collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem

// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func SortedFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

// SortedStableFunc collects values from seq into a new slice, sorts the slice
// using a stable sort, and returns it.
func SortedStableFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

@ianlancetaylor
Copy link
Contributor

When I look at the list today, I'm a little worried about slices.Append. I think it may be confusing that it's not like append. Maybe slices.AppendSeq? But I don't feel strongly about it.

@eliben
Copy link
Member

eliben commented Feb 10, 2024

When I look at the list today, I'm a little worried about slices.Append. I think it may be confusing that it's not like append. Maybe slices.AppendSeq? But I don't feel strongly about it.

slices.Append feels like the odd-one-out to me too. On the other hand, if we called it AppendSeq it would look different from Collect, which doesn't have a Seq suffix. Another option is AppendInto.

@earthboundkid
Copy link
Contributor

slices.AppendAll?

@mibk
Copy link
Contributor

mibk commented Feb 12, 2024

Or slices.AppendFrom.

@alexaandru
Copy link

I don't think that naming is the problem, but rather the fact that we have two different ways of doing the exact same thing (namely, appending to a slice). Besides, if people see an AppendFrom they might wonder where is the missing AppendTo or worse, maybe think append is when you want to append to a slice and AppendFrom when you want to append from a slice. It's just new vs make on steroids... And AppendAll begs the question: "did append() only append some and that's why AppendAll() was added?". I mean, if you need an append ALL, clearly "someone" didn't append enough before... or we wouldn't need it, right?

I think that the bigger issue is that before this, appending to a slice was a no brainer: append(). End of story. After this change, we got decisions to make. And they involve probably the most common operation on the most common data structure used by... everyone?

To me personally, the most natural thing would be for append() to continue to do what it does today: append stuff to slices. Even if that stuff is a sequence of bytes, of strings, of anys or just an iter.Seq, which itself is the very definition of a sequence. It wouldn't break backward compatibility, it wouldn't "overload" append with any new responsibility, it would simply continue to append a sequence of elements to a slice, regardless of how that sequence is expressed.

@gophun
Copy link

gophun commented Feb 12, 2024

To me personally, the most natural thing would be for append() to continue to do what it does today: append stuff to slices. Even if that stuff is a sequence of bytes, of strings, of anys or just an iter.Seq, which itself is the very definition of a sequence.

Does this append the iter.Seq itself or its elements?

var s []any
var seq iter.Seq[any]
s = append(s, seq)

You'd need a ... operator for iter.Seqs.

@Merovius
Copy link
Contributor

Merovius commented Feb 12, 2024

My counter-argument to changing append: As @gophun mentioned, you'd need to make it append(s, seq...), if nothing else, then for consistency. But then, the question becomes why you can pass an iter.Seq as a variadic argument to append, but not to, say fmt.Println. So we would want to really allow using an iter.Seq as a variadic argument in general. But then, the callee can use a variadic argument as a slice (including writing to it and randomly accessing it), which means the only way to implement that would be to essentially do fmt.Println(slices.Collect(seq)...), which is expensive - and the expense would probably frustrate those who don't treat the argument as a slice, but just read it once, in sequence.

So I don't really see a satisfying way to retro-fit iter.Seq into append.

Personally, I think slices.Append is fine as it is. The comparison to append is not awesome, but IMO not fatal.

@alexaandru
Copy link

Yes, I stand corrected, modifying append is not really what I was looking for, just having append(s, slices.Collect(iter.Seq)...) work, the append(s, iter.Seq) should do the appending of iter.Seq to a []iter.Seq as expected. No extra magic or anything like that.

Having slices.Append() also do that, more succinctly, doesn't really bother me (and since it appends, it may as well be called that, I don't see a need to look for variations of the name "append").

I apologize for the noise.

@mohamedattahri
Copy link
Contributor

mohamedattahri commented Feb 12, 2024 via email

@gophun
Copy link

gophun commented Feb 12, 2024

So we would want to really allow using an iter.Seq as a variadic argument in general. But then, the callee can use a variadic argument as a slice (including writing to it and randomly accessing it), which means the only way to implement that would be to essentially do fmt.Println(slices.Collect(seq)...), which is expensive - and the expense would probably frustrate those who don't treat the argument as a slice, but just read it once, in sequence.

The case of s = append(s, seq...) might be optimized by the compiler to avoid allocating an intermediate slice. But yes, in the general case collecting elements into a slice would be necessary unless the compiler can determine whether the elements are used as a slice or merely iterated over within the function. I don't know enough about compiler optimizations to judge whether this is feasible or not.

@rsc
Copy link
Contributor Author

rsc commented Feb 14, 2024

AppendSeq sounds fine; good point about confusion with builtin append.

@rsc
Copy link
Contributor Author

rsc commented Feb 14, 2024

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

// All returns an iterator over index-value pairs in the slice.
// The indexes range in the usual order, from 0 through len(s)-1.
func All[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem]

// Backward returns an iterator over index-value pairs in the slice,
// traversing it backward. The indexes range from len(s)-1 down to 0.
func Backward[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] 

// Values returns an iterator over the values in the slice,
// starting with s[0].
func Values[Slice ~[]Elem, Elem any](s Slice) iter.Seq[Elem]

// AppendSeq appends the values from seq to the slice and returns the extended slice.
func AppendSeq[Slice ~[]Elem, Elem any](x Slice, seq iter.Seq[Elem]) Slice 

// Collect collects values from seq into a new slice and returns it.
func Collect[Elem any](seq iter.Seq[Elem]) []Elem

// Sorted collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem

// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func SortedFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

// SortedStableFunc collects values from seq into a new slice, sorts the slice
// using a stable sort, and returns it.
func SortedStableFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

@rsc
Copy link
Contributor Author

rsc commented Mar 1, 2024

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

// All returns an iterator over index-value pairs in the slice.
// The indexes range in the usual order, from 0 through len(s)-1.
func All[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem]

// Backward returns an iterator over index-value pairs in the slice,
// traversing it backward. The indexes range from len(s)-1 down to 0.
func Backward[Slice ~[]Elem, Elem any](s Slice) iter.Seq2[int, Elem] 

// Values returns an iterator over the values in the slice,
// starting with s[0].
func Values[Slice ~[]Elem, Elem any](s Slice) iter.Seq[Elem]

// AppendSeq appends the values from seq to the slice and returns the extended slice.
func AppendSeq[Slice ~[]Elem, Elem any](x Slice, seq iter.Seq[Elem]) Slice 

// Collect collects values from seq into a new slice and returns it.
func Collect[Elem any](seq iter.Seq[Elem]) []Elem

// Sorted collects values from seq into a new slice, sorts the slice, and returns it.
func Sorted[Elem comparable](seq iter.Seq[Elem]) []Elem

// SortedFunc collects values from seq into a new slice, sorts the slice, and returns it.
func SortedFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

// SortedStableFunc collects values from seq into a new slice, sorts the slice
// using a stable sort, and returns it.
func SortedStableFunc[Elem any](seq iter.Seq[Elem], cmp func(Elem, Elem) int) []Elem 

@rsc rsc changed the title proposal: slices: add iterator-related functions slices: add iterator-related functions Mar 1, 2024
@rsc rsc modified the milestones: Proposal, Backlog Mar 1, 2024
@gopherbot
Copy link

Change https://go.dev/cl/568477 mentions this issue: slices: add iterator-related functions

@mpx
Copy link
Contributor

mpx commented Mar 2, 2024

I suspect there is a typo in the API above. Sorted should probably take a cmp.Ordered rather than a comparable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests