C# Behind the scenes – `yield`

The first “Behind the scenes” article is about the yield keyword.

Let's take as an example a very simple method that returns the numbers 1, 2, 3:

public IEnumerable<int> ExampleEnumerable()
{
    var arr = new int[] { 1, 2, 3 };
    foreach (var i in arr)
    {
        yield return i;
    }
}

The compiler translates these few lines of code into an entire class.

(I will not show the actual generated code here, because it is hard to read for anyone who is not familiar with compiler-generated code, so it would not help. Instead, I’ll show code that is close enough to the generated code.)

The ExampleEnumerable method becomes something like this:

public IEnumerable<int> ExampleEnumerable()
{
    Yield.YeildGeneratedCode yeildExample = new Yield.YeildGeneratedCode(-2);
    yeildExample._this = this;
    return (IEnumerable<int>)yeildExample;
}

And this is the YeildGeneratedCode class (with explanatory comments):

class YeildGeneratedCode : IEnumerable<int>, IEnumerable, IEnumerator<int>, IDisposable, IEnumerator
{
    private int _state; // The state of the enumerator. See the explanation below.
    private int _current; // The current item
    private readonly int _initialThreadId; // For thread safety. See GetEnumerator below.
    public Yield _this; // Holds the instance of the original class. In this case it is a Yield object, not a YeildGeneratedCode object
    private int[] _collection; // The collection of items to enumerate
    private int _index; // Index of the current item in the collection

    int IEnumerator<int>.Current => _current; // Return the current item (the item that was set by the last MoveNext call)

    object IEnumerator.Current => (object)_current;

    public YeildGeneratedCode(int state)
    {
        // Initialize a new enumerable. A state of -2 means that no enumerator
        // has been created yet. The thread id is stored for thread safety.
        _state = state;
        _initialThreadId = Environment.CurrentManagedThreadId;
    }

    void IDisposable.Dispose()
    {
        // Use if needed
    }

    bool IEnumerator.MoveNext()
    {
        switch (_state)
        {
            case 0: // Initialized. Change to running, then initialize the collection and the index
                _state = -1;
                _collection = new int[] { 1, 2, 3 };
                _index = 0;
                break;
            case 1: // Suspended. Change to running and advance the index
                _state = -1;
                _index++;
                break;
            default: // Other states
                return false;
        }
        if (_index < _collection.Length)
        {
            // If there is another item, set current and change the state to suspended
            _current = _collection[_index];
            _state = 1;
            return true;
        }
        _collection = null;
        return false;
    }

    void IEnumerator.Reset()
    {
        // This is not the real implementation.
        // It is only an example of what could be done here.
        // The generated implementation throws NotSupportedException.
        _state = 0;
        _index = 0;
    }

    IEnumerator<int> IEnumerable<int>.GetEnumerator()
    {
        YeildGeneratedCode yeildGeneratedCode;
        if (_state == -2 && _initialThreadId == Environment.CurrentManagedThreadId)
        {
            // If no enumerator has been created yet and the
            // original thread is the same as the calling thread,
            // change the state to initialized and return this
            _state = 0;
            yeildGeneratedCode = this;
        }
        else
        {
            // Otherwise, return a new enumerator
            yeildGeneratedCode = new YeildGeneratedCode(0);
            yeildGeneratedCode._this = _this;
        }
        return yeildGeneratedCode;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<int>)this).GetEnumerator();
    }
}
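
To see the GetEnumerator logic from the outside, here is a small sketch. The ReuseDemo class and its Numbers method are names I made up for illustration. The first ReferenceEquals check relies on the reuse optimization described above, which is an implementation detail of the compiler, so I don't promise its result. What is certain is that the two enumerators advance independently.

```csharp
using System;
using System.Collections.Generic;

public static class ReuseDemo
{
    // Hypothetical iterator used only for this demonstration.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Main()
    {
        IEnumerable<int> seq = Numbers();
        IEnumerator<int> first = seq.GetEnumerator();
        IEnumerator<int> second = seq.GetEnumerator();

        // Implementation detail: with the reuse optimization, the first
        // enumerator is usually the enumerable object itself.
        Console.WriteLine(ReferenceEquals(seq, first));

        // The second call gets its own object with its own state.
        Console.WriteLine(ReferenceEquals(first, second)); // False

        first.MoveNext();
        first.MoveNext();
        second.MoveNext();
        Console.WriteLine(first.Current);  // 2
        Console.WriteLine(second.Current); // 1
    }
}
```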

When a method that uses yield returns an IEnumerable, calling it does not compute the values immediately. Instead, an object of the generated class is created and returned. The actual work is done only when MoveNext is invoked (by foreach, for example).
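
A quick way to check this deferred behavior yourself is to record when the body of the method actually runs. This is only a sketch: DeferredDemo, Numbers and the Log list are hypothetical names that I added for the demonstration.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Hypothetical log, used only to observe when the iterator body runs.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Numbers()
    {
        Log.Add("body started");
        yield return 1;
        Log.Add("after first yield");
        yield return 2;
    }

    public static void Main()
    {
        IEnumerable<int> seq = Numbers();      // the body has not run yet
        Console.WriteLine(Log.Count);          // 0

        using (IEnumerator<int> e = seq.GetEnumerator())
        {
            Console.WriteLine(Log.Count);      // 0
            e.MoveNext();                      // runs the body up to the first yield
            Console.WriteLine(e.Current);      // 1
            Console.WriteLine(Log.Count);      // 1
        }
    }
}
```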

The enumerator is a state machine with these states:
Initialized (before) = 0
Running = -1
Suspended = 1
(there can also be a 'closed' or 'after' state)

On each MoveNext call, if the state is initialized or suspended, it changes to running and then the next item (if any) is returned. In the running and closed states, MoveNext does not return a value.

The Reset method could in principle return the enumerator to an earlier state, but the class generated for yield does not implement it: calling it throws NotSupportedException.
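
Here is a minimal sketch that calls Reset on an enumerator produced by yield and reports whether it throws. ResetDemo, Numbers and ResetThrows are hypothetical names that I chose for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ResetDemo
{
    // Hypothetical iterator used only for this demonstration.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    // Returns true when Reset on the compiler-generated enumerator throws.
    public static bool ResetThrows()
    {
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            e.MoveNext();
            try
            {
                e.Reset();
                return false;
            }
            catch (NotSupportedException)
            {
                return true;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(ResetThrows()); // True
    }
}
```

If you need to start over, just call the iterator method again (or call GetEnumerator again) to get a fresh enumerator.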

To complete the picture, let's look at the IL code of the original method:

IL_0000: ldc.i4.s -2 // Load the state: -2 means not initialized yet
IL_0002: newobj instance void Yield.YeildGeneratedCode::.ctor(int32) // Create the generated object
IL_0007: dup
IL_0008: ldarg.0 // Load 'this'
IL_0009: stfld class Yield Yield.YeildGeneratedCode::'<>_this' // Store 'this' in the corresponding field of the generated class
IL_000e: ret

As you can see, the method contains no loop. It just creates the object and returns it.
If someone later writes foreach (var item in enumerableObject), the
enumerator is created and MoveNext is called on each iteration.
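
To make that concrete, here is roughly what a foreach over the sequence does, written out by hand. This is a sketch and not the exact code the compiler emits. ForeachDemo and EnumerateByHand are hypothetical names that I added.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachDemo
{
    public static IEnumerable<int> ExampleEnumerable()
    {
        var arr = new int[] { 1, 2, 3 };
        foreach (var i in arr)
        {
            yield return i;
        }
    }

    // Roughly what foreach does: get an enumerator, call MoveNext until it
    // returns false, read Current each time, and dispose at the end.
    public static List<int> EnumerateByHand(IEnumerable<int> source)
    {
        var result = new List<int>();
        IEnumerator<int> e = source.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                result.Add(e.Current);
            }
        }
        finally
        {
            e.Dispose();
        }
        return result;
    }

    public static void Main()
    {
        List<int> items = EnumerateByHand(ExampleEnumerable());
        Console.WriteLine(string.Join(", ", items)); // 1, 2, 3
    }
}
```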

Here is one more example. It produces the same sequence, but the method is written like this:

public IEnumerable<int> ExampleEnumerable()
{
    yield return 1;
    yield return 2;
    yield return 3;
}

In this case, the switch in the MoveNext will look like this:

switch (_state)
{
    case 0:
        _state = -1;
        _current = 1;
        _state = 1;
        return true;
    case 1:
        _state = -1;
        _current = 2;
        _state = 2;
        return true;
    case 2:
        _state = -1;
        _current = 3;
        _state = 3;
        return true;
    case 3:
        _state = -1;
        return false;
    default:
        return false;
}

Here there is no need to check whether there is an item to return. Everything is known from the start, so each MoveNext call simply returns the next item in the order of the yield statements in the original method (that item becomes the current item).
Note that here the state variable acts as a kind of index.
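
If you want to run this state machine yourself, here is a hand-written enumerator built around the same idea. It is a simplified sketch and not the code the compiler generates. ThreeYields, StateDemo and Drain are hypothetical names that I made up.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written enumerator. The names are illustrative only.
public sealed class ThreeYields : IEnumerator<int>
{
    private int _state;   // 0 = before the first MoveNext call
    private int _current;

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        // The state doubles as an index into the list of yield statements.
        switch (_state)
        {
            case 0: _current = 1; _state = 1; return true;
            case 1: _current = 2; _state = 2; return true;
            case 2: _current = 3; _state = 3; return true;
            default: _state = -1; return false;
        }
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}

public static class StateDemo
{
    // Collects everything the enumerator produces.
    public static List<int> Drain(IEnumerator<int> e)
    {
        var items = new List<int>();
        while (e.MoveNext())
        {
            items.Add(e.Current);
        }
        return items;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Drain(new ThreeYields()))); // 1, 2, 3
    }
}
```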


If you want to investigate a more complex example, try to work out what happens in this case:

foreach (var i in new int[] { 1, 2 })
{
    foreach (var i1 in new int[] { 1, 2, 3 })
    {
        yield return i1 * i;
    }
}
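
To check your reasoning, you can wrap the loops in a method and print what comes out. This only shows the produced sequence, not the generated class. NestedDemo and Products are hypothetical names that I added.

```csharp
using System;
using System.Collections.Generic;

public static class NestedDemo
{
    // The nested loops from above, wrapped in an iterator method.
    public static IEnumerable<int> Products()
    {
        foreach (var i in new int[] { 1, 2 })
        {
            foreach (var i1 in new int[] { 1, 2, 3 })
            {
                yield return i1 * i;
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Products())); // 1, 2, 3, 2, 4, 6
    }
}
```

A hint for the exercise: the generated class has to keep both arrays and both indexes in fields, because they must survive between MoveNext calls.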

Have fun until the next post
