Lyon's Blog

Gevent Internals

Introduction

Gevent is an efficient coroutine-based concurrency framework built on libev and greenlet. If you don't know how greenlet works, I recommend reading about it first. If you only want to know how to use gevent, I recommend http://sdiehl.github.io/gevent-tutorial/.

Greenlet diagram

First, an example from the official website.

>>> import gevent
>>> from gevent import socket
>>> urls = ['www.google.com', 'www.example.com', 'www.python.org']
>>> jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
>>> gevent.joinall(jobs, timeout=2)
>>> [job.value for job in jobs]
['74.125.79.106', '208.77.188.166', '82.94.164.162']

In this example, gevent resolves three hostnames to IP addresses concurrently. jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls] creates three task greenlets, and gevent.joinall(jobs, timeout=2) waits up to 2 seconds for all three task greenlets to finish.

This is the greenlet diagram of the program.

                             +-----+
                      +----> | task|
                      |      +-----+
                      |            
+-------+      +-----++      +-----+
| main  +----> | hub |-----> | task|
+-------+      +-----++      +-----+
                      |            
                      |      +-----+
                      +----> | task|
                             +-----+

How it works

main is the default greenlet, the task greenlets do the real work, and the hub greenlet is the core of gevent: it coordinates all other greenlets, including main. So what is the hub? The hub is where gevent uses libev; it owns the event loop. The hub is created the first time it is requested through get_hub(), and it is a per-thread singleton: only one hub exists in each thread.

def get_hub(*args, **kwargs):
    """Return the hub for the current thread.

    If hub does not exists in the current thread, the new one is created with call to :meth:`get_hub_class`.
    """
    global _threadlocal
    try:
        return _threadlocal.hub
    except AttributeError:
        hubtype = get_hub_class()
        hub = _threadlocal.hub = hubtype(*args, **kwargs)
        return hub
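
The thread-local singleton pattern in get_hub can be tried out without gevent. The following is a minimal sketch using only the standard library; ToyHub, get_toy_hub and hub_in_other_thread are hypothetical names made up for illustration and are not part of gevent.

```python
import threading


class ToyHub:
    """Hypothetical stand-in for gevent's Hub; only its identity matters here."""

    def __init__(self):
        self.thread_name = threading.current_thread().name


# Each thread sees its own attributes on this object.
_threadlocal = threading.local()


def get_toy_hub():
    """Return the ToyHub for the current thread, creating it on first use."""
    try:
        return _threadlocal.hub
    except AttributeError:
        hub = _threadlocal.hub = ToyHub()
        return hub


def hub_in_other_thread():
    """Fetch the hub from a fresh thread and hand it back to the caller."""
    result = []
    t = threading.Thread(target=lambda: result.append(get_toy_hub()))
    t.start()
    t.join()
    return result[0]


main_hub = get_toy_hub()
same_thread_again = get_toy_hub()
other_hub = hub_in_other_thread()

print(main_hub is same_thread_again)  # same thread: same hub object
print(main_hub is other_hub)          # other thread: a different hub object
```

Calling get_toy_hub() twice in one thread returns the same object, while a second thread gets its own instance, which is the behaviour "only one hub per thread" describes.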

When you call gevent.spawn, gevent creates a task greenlet with the hub as its parent.

spawn = Greenlet.spawn

class Greenlet(greenlet):
    """A light-weight cooperatively-scheduled execution unit."""

    def __init__(self, run=None, *args, **kwargs):
        hub = get_hub()
        greenlet.__init__(self, parent=hub)

    def start(self):
        """Schedule the greenlet to run in this loop iteration"""
        self._start_event = self.parent.loop.run_callback(self.switch)

    @classmethod
    def spawn(cls, *args, **kwargs):
        """Return a new :class:`Greenlet` object, scheduled to start.

        The arguments are passed to :meth:`Greenlet.__init__`.
        """
        g = cls(*args, **kwargs)
        g.start()
        return g

At the beginning, we said gevent uses libev. In self.parent.loop.run_callback(self.switch), loop is the hub's libev event loop, and the call registers the newly created task greenlet's switch method as a callback on that loop. When the loop runs, it invokes the callback, which switches into the task greenlet.
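
The key point is that spawning only schedules work; nothing runs until the loop does. A minimal sketch of that idea, assuming a plain FIFO queue in place of libev (ToyLoop is a hypothetical class, not gevent's loop, and it calls ordinary functions rather than switching greenlets):

```python
from collections import deque


class ToyLoop:
    """Hypothetical stand-in for the libev loop: a FIFO queue of callbacks."""

    def __init__(self):
        self._callbacks = deque()

    def run_callback(self, func, *args):
        # Only record the callback; nothing executes yet.
        self._callbacks.append((func, args))

    def run(self):
        # Drain callbacks in order, including any added while running.
        while self._callbacks:
            func, args = self._callbacks.popleft()
            func(*args)


log = []
loop = ToyLoop()
for name in ("task-1", "task-2", "task-3"):
    # Analogous in spirit to Greenlet.start(): register, do not run.
    loop.run_callback(log.append, name)

before_run = list(log)  # still empty: callbacks are only queued
loop.run()
after_run = list(log)   # callbacks ran in registration order

print(before_run)
print(after_run)
```

In gevent the registered callback is the greenlet's switch method, so running the callback transfers control into the task instead of making a normal function call.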

So when does the loop run? When main performs a blocking operation (such as gevent.joinall), it switches to the hub, whose run method looks like this:

def run(self):
    assert self is getcurrent(), 'Do not call Hub.run() directly'
    while True:
        loop = self.loop
        loop.error_handler = self
        try:
            loop.run()
        finally:
            loop.error_handler = None  # break the refcount cycle
        self.parent.throw(LoopExit('This operation would block forever'))

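loop.run() is then executed, so the loop can pick a greenlet to run. When that greenlet blocks or finishes, control returns to the hub and the loop picks another one. If loop.run() ever returns, there is nothing left for the loop to wait on, so the hub raises LoopExit in its parent.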
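
The "run one task until it blocks, then pick another" behaviour described above can be imitated with generators. This is only a toy model under loud assumptions: task and run_all are hypothetical helpers, each yield stands in for a blocking operation, and tasks are resumed round-robin, whereas gevent switches greenlet stacks and resumes a greenlet when its libev watcher fires.

```python
from collections import deque


def task(name, steps, log):
    """A toy task: each `yield` stands in for a blocking operation."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # "block": hand control back to the scheduler


def run_all(tasks):
    """Round-robin over the tasks until every one has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)      # resume until the next yield or the end
        except StopIteration:
            continue           # finished: drop it from the queue
        ready.append(current)  # blocked: put it back for a later turn


log = []
run_all([task("a", 2, log), task("b", 1, log), task("c", 2, log)])
print(log)  # ['a:0', 'b:0', 'c:0', 'a:1', 'c:1']
```

The interleaved log shows that no task runs to completion before the others start; each one gives up control at its "blocking" point, which is the cooperative scheduling the hub provides.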

Next up: an explanation of monkey patching.

Ref

http://blog.segmentfault.com/fantix/1190000000613814

https://github.com/surfly/gevent


Published

Sep 23, 2014
