I have been doing some Ruby programming as part of my recent work ventures, and I think a nice way to learn a new language is by asking questions on how to map your previous understanding on how to do things into this new environment, and discovering new ways of doing things in the process.
Python Generators
Having been programming in Python for the best part of the last 4 years, I got used to generators (or, equivalently, generator functions) as a convenient way of structuring certain computations. Generators are akin to interruptible computations which can yield control at specified points in time and then be resumed later as required. Further, whenever a generator yields control, it can return a value.
This makes them particularly suited to the implementation of iterators whose values are generated from a computation, on-the-fly. Let’s look at a Python example of an infinite enumeration, a generator which yields one integer number after the other until infinity:
The call to yield
in the function above means that every time the generator hits that line, it will yield control to the caller and return the current value for i
:
## The first three elements are: 5 6 7
Another way of looking at a computation running as a generator function is as a thread that can voluntarily yield control. A group of such generators could then be coordinated so that when one yields, another one runs, effectively transforming groups of generators into a mechanism for cooperative multitasking. Indeed, the more recent work on Python coroutines (part of asyncio) leverages the generator infrastructure very much this way. But enough about Python.
What about Ruby?
Well, the Ruby language does not have generators. It does have Fibers, which can accomplish pretty much the same thing1, but, more interestingly from a learning perspective, it has native support for continuations.
Continuations are a powerful control flow construct which allows you to “save” the current state of a computation and then resume it from that saved state whenever you like. This is similar to what generators do, except that with a generator you have a structured relationship between two computations which progress taking turns, as depicted in Fig. 1: on one end we have the caller computation, which transfers control to the generator whenever it calls next
and, on the other, we have the generator computation, which transfers control back to the caller whenever it calls yield
.
That is not what happens with continuations. Continuations will simply transfer control from the caller to an arbitrary saved state, forgetting that the caller ever existed. In particular, it is perfectly possible for a computation to transfer control back to an earlier point of itself (Fig. 2). In such situations, the only trace you will have that something else has happened when resuming said saved state are the eventual side effects that the preceding computation may have left behind before jumping back.
This leads to the widely cited “continuation sandwich” example:
In Ruby, in particular, any changes made to variables which were in scope when the continuation got created will be preserved.
To get a more concrete idea of what all this means, let us examine the program below:
require 'continuation'
$cc = nil
results = []
def fun
i = 0
callcc { |cc| $cc = cc }
i += 1
i
end
results << r = fun
$cc.call unless r >= 10
puts 'Result is: ' + results.join(', ')
## /usr/lib/x86_64-linux-gnu/ruby/2.7.0/continuation.so: warning: callcc is obsolete; use Fiber instead
## Result is: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
We begin by looking at line 12, where we call function fun
. This function: i) declares a variable named i
(line 6); ii) creates and stores a continuation in global variable $cc
with callcc (line 7); increments and returns i
(lines 8-9). We then store the result into local variable r
, and push it into the array results
. The first call will produce a 1.
Next, we hit line 13 and call back the continuation with $cc.call
. This causes us to jump back to when the continuation got created; i.e., back into line 7. At that point, there is no way to tell that we got there from a jump and are not visiting line 7 for the first time, except that the value of i
will be already 1 (recall that changes made to variables in scope are preserved, even if such changes are made after the continuation was created). i
will be then incremented again and the function will return from the original call again in line 12 again, except that it will return a 2 instead of a 1 this time.
This will happen another 8 times, until finally fun
will return 10, causing line 13 to be skipped and the result, an array with numbers 1 to 10 (our sandwich!), to be printed in line 14.
I hope that this example illustrates just how confusing a program that uses continuations, which are often compared to GOTO statements, can get. The runtime complexity required to support them is also significant, which is the reason why continuations got deprecated in Ruby about five years ago, but are still hanging in there.
Regardless, it should also be relatively clear that continuations are a more general concept than generators, and that it should be possible to implement generators on top of them if one so desires. Indeed, this is where we are headed next.
Ruby Generators with call/cc
The main issue with pure continuations is that they are a kind of one-way construct: the caller of the continuation can warp back to the past where the continuation was created, but it is not possible to just do some computation and then warp back to the future with a value. We can achieve that, however, by adding a second continuation to store caller state.
The general idea is that we use one continuation A to store the state of the generator’s computation when it yields, and another continuation B to store the state of the caller when it calls next
. The generator can then call continuation B and warp back to the caller when it yields, whilst the caller can call continuation A and warp back to the generator when it calls next
. This idea is expressed into code below:
require 'continuation'
##
# Python-style generators in Ruby using continuations.
class RubyGen
##
# Creates a new generator.
#
# = Example
#
# gen = RubyGen.new(3, 5) do |gen, x, y|
# (x..y).each { |i| gen.yield(i) }
# end
#
# gen.next # returns 3
# gen.next # returns 4
# gen.next # returns 5
# gen.next # raises StopIteration
#
def initialize(*pars, &block)
@warp_generator = nil # Continuation A
@warp_caller = nil # Continuation B
@exception = nil
@block = -> do
begin
block.call(self, *pars)
# Not exactly great Ruby etiquette...
# but we're just messing around.
@exception = StopIteration
rescue => e
@exception = e
ensure
# Don't forget to transfer control back
# to caller when the block ends, whichever
# way it ends.
@warp_caller.call
end
end
end
def yield(value)
@value = value
callcc do |cc|
# Saves generator state so we can "warp into"
# the generator on the next call to `next`.
@warp_generator = cc
# Warps back to caller.
@warp_caller.call(value)
end
end
def next
value = callcc do |cc|
# Saves caller state so we can "warp back" to the caller
# when the generator yields.
@warp_caller = cc
# Warps into the generator.
@warp_generator.nil? ? @block.call : @warp_generator.call
end
raise @exception unless @exception.nil?
value
end
end
## /usr/lib/x86_64-linux-gnu/ruby/2.7.0/continuation.so: warning: callcc is obsolete; use Fiber instead
To implement the same generator as we had in our Python example, we would then write:
and finally, running:
will print The first three elements are: 5 6 7
, just like in our Python example.
Fibers can actually do more. Like the continuation-based generators we develop here, they snapshot the entire call stack. This means that if you call a second function from your generator, and then that fuction calls a third function, and then this third function calls
yield
, things are still going to work: the call toyield
will return control to the caller, and the next call tonext
will resume from inside of the third function call, with the call stack just as it was. Not so with Python generators, which will not work across nested calls.↩︎