What is most idiomatic way to create an iterator in Go?-ThrowExceptions

Exception or error:

One option is to use channels. Channels are like iterators in a way and you can iterate over them using range keyword. But when you find out you can’t break out of this loop without leaking goroutine the usage becomes limited.

What is the idiomatic way to create iterator pattern in go?

Edit:

The fundamental problem with channels is they are a push model. Iterator is is a pull model. You don’t have to tell iterator to stop. I’m looking for a way to iterate over collections in a nice expressive way. I would also like to chain iterators (map, filter, fold alternatives).

How to solve:

Channels are useful, but closures are often more suitable.

package main

import "fmt"

func main() {
    gen := newEven()
    fmt.Println(gen())
    fmt.Println(gen())
    fmt.Println(gen())
    gen = nil // release for garbage collection
}

func newEven() func() int {
    n := 0
    // closure captures variable n
    return func() int {
        n += 2
        return n
    }
}

Playground: http://play.golang.org/p/W7pG_HUOzw

Don’t like closures either? Use a named type with a method:

package main

import "fmt"

func main() {
    gen := even(0)
    fmt.Println(gen.next())
    fmt.Println(gen.next())
    fmt.Println(gen.next())
}

type even int

func (e *even) next() int {
    *e += 2
    return int(*e)
}

Playground: http://play.golang.org/p/o0lerLcAh3

There are tradeoffs among the three techniques so you can’t nominate one as idiomatic. Use whatever best meets your needs.

Chaining is easy because functions are first class objects. Here’s an extension of the closure example. I added a type intGen for integer generator which makes it clear where generator functions are used as arguments and return values. mapInt is defined in a general way to map any integer function to an integer generator. Other functions such as filter and fold could be defined similarly.

package main

import "fmt"

func main() {
    gen := mapInt(newEven(), square)
    fmt.Println(gen())
    fmt.Println(gen())
    fmt.Println(gen())
    gen = nil // release for garbage collection
}

type intGen func() int

func newEven() intGen {
    n := 0
    return func() int {
        n += 2
        return n
    }
}

func mapInt(g intGen, f func(int) int) intGen {
    return func() int {
        return f(g())
    }
}

func square(i int) int {
    return i * i
}

Playground: http://play.golang.org/p/L1OFm6JuX0

Answer:

TL;DR: Forget closures and channels, too slow. If the individual elements of your collection are accessible by index, go for the classic C iteration over an array-like type. If not, implement a stateful iterator.

I needed to iterate over some collection type for which the exact storage implementation is not set in stone yet. This, plus the zillions other reasons to abstract the implementation details from the client, lead me to do some testing with various iteration methods. Full code here, including some implementations that make use of errors as values. Here are the benchmark results:

  • classic C iteration over an array-like structure. The type provides the methods ValueAt() and Len():

    l := Len(collection)
    for i := 0; i < l; i++ { value := collection.ValueAt(i) }
    // benchmark result: 2492641 ns/op
    
  • Closure style iterator. The collection’s Iterator method returns a next() function (a closure over the collection and cursor) and a hasNext boolean. next() returns the next value and a hasNext boolean. Note that this runs much faster than using separate next() and hasNext() closures returning single values:

    for next, hasNext := collection.Iterator(); hasNext; {
        value, hasNext = next()
    }
    // benchmark result: 7966233 ns/op !!!
    
  • Stateful iterator. A simple struct with two data fields, the collection and a cursor, and two methods: Next() and HasNext(). This time the Iterator() method of the collection returns a pointer to a properly initialized iterator structure:

    for iter := collection.Iterator(); iter.HasNext(); {
        value := iter.Next()
    }
    // benchmark result: 4010607 ns/op
    

As much as I like closures, performance wise it’s a no-Go. As for design patterns, well, Gophers prefer the term “idiomatic way to do” stuff for good reason. Also grep the go source tree for iterators: with so few files that mention the name, iterators are definitely not a Go thing.

Also check out this page: http://ewencp.org/blog/golang-iterators/

Anyhow, interfaces do not help in any way here, unless you want to define some Iterable interface, but this is a completely different subject.

Answer:

TL;DR: Iterators are not idiomatic in Go. Leave them to other languages.

In depth then, the Wikipedia entry “Iterator pattern” begins, “In object-oriented programming, the iterator pattern is a design pattern…” Two red flags there: First, object oriented programming concepts often don’t translate well into Go, and second, many Go programmers don’t think much of design patterns. That first paragraph also includes “The iterator pattern decouples algorithms from containers”, but only after stating “an iterator [accesses] the container’s elements. Well which is it? If an algorithm is accessing the container’s elements it can hardly claim to be decoupled. The answer in many languages involves some kind of generics that allows the language to generalize over similar data structures. The answer in Go is interfaces. Interfaces enforce a stricter decoupling of algorithms and objects by denying access to structure and requiring all interaction be based on behavior. Behavior means capabilites expressed through methods on the data.

For a minimal iterator type the needed capability is a Next method. A Go interface can represent an iterator object by simply specifying this single method signature. If you want a container type to be iterable, it must satisfy the iterator interface by implementing all of the methods of the interface. (We only have one here, and in fact it is common for interfaces to have only a single method.)

A minimal working example:

package main

import "fmt"

// IntIterator is an iterator object.
// yes, it's just an interface.
type intIterator interface {
    Next() (value int, ok bool)
}

// IterableSlice is a container data structure
// that supports iteration.
// That is, it satisfies intIterator.
type iterableSlice struct {
    x int
    s []int
}

// iterableSlice.Next implements intIterator.Next,
// satisfying the interface.
func (s *iterableSlice) Next() (value int, ok bool) {
    s.x++
    if s.x >= len(s.s) {
        return 0, false
    }
    return s.s[s.x], true
}

// newSlice is a constructor that constructs an iterable
// container object from the native Go slice type.
func newSlice(s []int) *iterableSlice {
    return &iterableSlice{-1, s}
}

func main() {
    // Ds is just intIterator type.
    // It has no access to any data structure.
    var ds intIterator

    // Construct.  Assign the concrete result from newSlice
    // to the interface ds.  ds has a non-nil value now,
    // but still has no access to the structure of the
    // concrete type.
    ds = newSlice([]int{3, 1, 4})

    // iterate
    for {
        // Use behavior only.  Next returns values
        // but without insight as to how the values
        // might have been represented or might have
        // been computed.
        v, ok := ds.Next()
        if !ok {
            break
        }
        fmt.Println(v)
    }
}

Playground: http://play.golang.org/p/AFZzA7PRDR

This is the basic idea of interfaces, but it’s absurd overkill for iterating over a slice. In many cases where you would reach for an iterator in other languages, you write Go code using built in language primitives that iterate directly over basic types. Your code stays clear and concise. Where that gets complicated, consider what features you really need. Do you need to emit results from random places in some function? Channels provide a yield-like capability that allows that. Do you need infinite lists or lazy evaluation? Closures work great. Do you have different data types and you need them to transparently support the same operations? Interfaces deliver. With channels, functions, and interfaces all first class objects, these techniques are all easily composable. So what then is the most idiomatic way? It is to experiment with different techniques, get comfortable with them, and use whatever meets your needs in the simplest way possible. Iterators, in the object oriented sense anyway, are almost never simplest.

Answer:

You can break out without leaking by giving your goroutines a second channel for control messages. In the simplest case it’s just a chan bool. When you want the goroutine to stop, you send on this channel. Inside the goroutine, you put the iterator’s channel send and the listen on the control channel inside a select.

Here’s an example.

You can take this further by allowing different control messages, such as “skip”.

Your question is pretty abstract, so to say more, a concrete example would be helpful.

Answer:

Looking to the container/list package it looks like there is no way to do it. C-like way should be used if you iterate over object.

Something like this.

type Foo struct {
...
}

func (f *Foo) Next() int {
...
}

foo := Foo(10)

for f := foo.Next(); f >= 0; f = foo.Next() {
...
}

Answer:

Here’s a way I thought of to do it with channels and goroutines:

package main

import (
    "fmt"
)

func main() {
    c := nameIterator(3)
    for batch := range c {
        fmt.Println(batch)
    }
}

func nameIterator(batchSize int) <-chan []string {
    names := []string{"Cherry", "Cami", "Tildy", "Cory", "Ronnie", "Aleksandr", "Billie", "Reine", "Gilbertina", "Dotti"}

    c := make(chan []string)

    go func() {
        for i := 0; i < len(names); i++ {
            startIdx := i * batchSize
            endIdx := startIdx + batchSize

            if startIdx > len(names) {
                continue
            }
            if endIdx > len(names) {
                c <- names[startIdx:]
            } else {
                c <- names[startIdx:endIdx]
            }
        }

        close(c)
    }()

    return c
}

https://play.golang.org/p/M6NPT-hYPNd

I got the idea from Rob Pike’s Go Concurrency Patterns talk.

Answer:

The very fact that there are so many seemingly different solutions here means that the doesn’t seem to be an idiomatic way of doing it. I am starting my way in Go, and I thought there would be a way to tap into the power of range thing. Sadly, no.

Here’s what I came up with (it is similar to some of the solutions above)

// Node Basically, this is the iterator (or the head of it) 
// and the scaffolding for your itterable type
type Node struct {
    next *Node
}

func (node *Node) Next() (*Node, bool) {
    return node.next, node.next != nil
}

// Add add the next node
func (node *Node) Add(another *Node) {
    node.next = another
}

and here is how I use it:

node := &Node{}
node.Add(&Node{})

for goOn := true; goOn; node, goOn = node.Next() {
    fmt.Println(node)
}

Or probably a more elegant solution:

...
func (node *Node) Next() *Node {
    return node.next
}
...

for ; node != nil; node = node.Next() {
    fmt.Println(node)
}

Answer:

As other mates said you can work with channels to implement Generator design pattern which is what you’re looking for.

Generator functions

Channels and goroutines provide a natural substrate for implementing a form of
producer/producer pattern using generator functions. In this approach, a goroutine is
wrapped in a function which generates values that are sent via a channel returned by the
function. The consumer goroutine receives these values as they are generated.

Example extracted from Go Design Patterns For Real-World

package main

import (
    "fmt"
    "strings"
    )

func main() {
    data := []string{"Sphinx of black quartz, judge my vow", 
             "The sky is blue and the water too", 
             "Cozy lummox gives smart squid who asks for job pen",
             "Jackdaws love my big sphinx of quartz",
             "The quick onyx goblin jumps over the lazy dwarf"}
    histogram := make(map[string]int)
    words := words(data) // returns handle to data channel
    for word := range words { // Reads each word from channel every time
        histogram[word]++
    }   
    fmt.Println(histogram)
}

// Generator function that produces data
func words(data []string) <-chan string {
    out := make(chan string)
    // Go Routine
    go func() {
        defer close(out) // closes channel upon fn return
        for _, line := range data {
            words := strings.Split(line, " ")
            for _, word := range words {
                word = strings.ToLower(word)
                out <- word // Send word to channel 
            }
        }
     }()
     return out
}

https://play.golang.org/p/f0nynFWbEam

In this example, the generator function, declared as func words(data []string) <-
chan string
, returns a receive-only channel of string elements. The consumer function, in this instance main(), receives the data emitted by the generator function, which is processed using a for…range loop.

An improved version of this design pattern:

https://play.golang.org/p/uyUfz3ALO6J

Adding methods like Next and Error:

type iterator struct {
    valueChan   <-chan interface{}
    okChan      <-chan bool
    errChan     <-chan error
    err     error
}

func (i *iterator) next() (interface{}, bool) {
    var (
        value   interface{}
        ok  bool
    )
    value, ok, i.err = <-i.valueChan, <-i.okChan, <-i.errChan
    return value, ok
}

func (i *iterator) error() error {
    return i.err
}

// Generator function that produces data
func NewIterator(data []string) iterator {
    out := make(chan interface{})
    ok := make(chan bool)
    err := make(chan error)
    // Go Routine
    go func() {
        defer close(out) // closes channel upon fn return
        for _, line := range data {
            words := strings.Split(line, " ")
            for _, word := range words {
                word = strings.ToLower(word)
                out <- word // Send word to channel and waits for its reading
                ok <- true
                err <- nil // if there was any error, change its value
            }
        }
        out <- ""
        ok <- false
        err <- nil
     }()

     return iterator{ out, ok, err, nil }
}

Leave a Reply

Your email address will not be published. Required fields are marked *