
Zarkov is an event logger

Over the past few weeks I've been working on a service in Python that I'm calling, in the tradition of naming projects after characters in Flash Gordon, Zarkov. So what exactly is Zarkov? Well, Zarkov is many things (and may grow to more):

  • Zarkov is an event logger
  • Zarkov is a lightweight map-reduce framework
  • Zarkov is an aggregation service
  • Zarkov is a webservice

In the next few posts, I'll be going over each of the components of Zarkov and how they work together. Today, I'll focus on Zarkov as an event logger.

Technologies

There are a few technologies you should know something about before working with Zarkov. Here's a brief overview of each.

  • ZeroMQ: Zarkov uses ZeroMQ throughout as its wire and buffering protocol. Generally you'll use PUSH sockets to send data and events to Zarkov, and REQ sockets to talk to the Zarkov map-reduce router.
  • MongoDB: Zarkov uses MongoDB to store events and aggregates, so you should have a MongoDB server handy if you'll be doing anything with Zarkov. We also use Ming, an object-document mapper developed at SourceForge, to do most of our interfacing with MongoDB.
  • Gevent: Internally, Zarkov uses gevent's "green threads" to keep things nice and lightweight. If you're just using Zarkov, you probably don't need to know a lot about gevent, but if you start hacking on the source code you'll find it everywhere, along with special ZeroMQ and Ming adapters for gevent. So it's good to have at least a passing familiarity.

Installation

To install Zarkov, you'll first need ZeroMQ and gevent, which probably means installing the ZeroMQ and libevent development libraries. On Ubuntu, I had to build ZeroMQ 2.1 from source (which isn't too tough):

$ wget http://download.zeromq.org/zeromq-2.1.7.tar.gz
$ tar xzf zeromq-2.1.7.tar.gz
$ cd zeromq-2.1.7
$ ./configure --prefix=/usr/local && make
$ sudo make install
$ # on Ubuntu, libevent's development files are available as a package:
$ sudo apt-get install libevent-dev
$ # otherwise, build libevent from source:
$ wget http://monkey.org/~provos/libevent-1.4.13-stable.tar.gz
$ tar xzf libevent-1.4.13-stable.tar.gz
$ cd libevent-1.4.13-stable
$ ./configure --prefix=/usr/local && make
$ sudo make install

Now you should be able to do a regular pip install to get everything else:

$ virtualenv zarkov
$ source zarkov/bin/activate
(zarkov) $ pip install Zarkov

Next, you should customize your development.yaml file. Here's a convenient example we use in testing:

bson_bind_address: tcp://0.0.0.0:6543
json_bind_address: tcp://0.0.0.0:6544
web_port: 8081
backdoor: 127.0.0.1:6545
mongo_uri: mongodb://localhost:27017
mongo_database: zarkov
verbose: true
incremental: 0
zmr:
        req_uri: tcp://127.0.0.1:5555
        req_bind: tcp://0.0.0.0:5555
        worker_uri: tcp://0.0.0.0
        local_workers: 2
        job_root: /tmp/zmr
        map_page_size: 250000000
        map_job_size: 10000
        outstanding_maps: 16
        outstanding_reduces: 16
        request_greenlets: 16
        compress: 0 # compression level
        src_port: 0 # choose a random port
        sink_port: 0 # choose a random port
        processes_per_worker: null # default == # of cpus
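
The bind addresses in this file use 0.0.0.0, which tells the server to listen on every interface; a client needs a concrete host to connect to instead. As a rough sketch (the helper name client_uri is made up for illustration and is not part of Zarkov), here's one way to turn a bind address into something a client on the same machine could connect to:

```python
from urllib.parse import urlsplit


def client_uri(bind_address, host='localhost'):
    '''Turn a bind address like tcp://0.0.0.0:6543 into a connect address.

    Hypothetical helper for illustration only; not part of Zarkov.
    '''
    parts = urlsplit(bind_address)
    if parts.port is None:
        # e.g. worker_uri above has no port, so there is nothing to connect to
        raise ValueError('no port in %r' % bind_address)
    return '%s://%s:%d' % (parts.scheme, host, parts.port)


if __name__ == '__main__':
    print(client_uri('tcp://0.0.0.0:6543'))
```

The default host assumes the client runs on the same machine as the server; pass a different host argument otherwise.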

Zarkov defines a format for an event stream which tries to be fairly generic (though our main use-case is logging SourceForge events for later aggregation). A Zarkov event is a BSON object containing the following data:

  • timestamp (datetime) : when did the event occur?
  • type (str): what is the type of event?
  • context (object): in what context did the event occur? On SourceForge, this includes the project context, the user logged in, the IP address, etc.
  • extra (whatever): this is purely up to the event generator. It might be a string, integer, object, array, whatever. (It should be supported by BSON, of course.)
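
To make the four fields concrete, here's a minimal sketch of an event built as a plain Python dict. The make_event helper, the 'login' event type, and the context keys are all invented for illustration; Zarkov's own model may well look different. Every value is a type that BSON can represent.

```python
from datetime import datetime, timezone


def make_event(type, context=None, extra=None, timestamp=None):
    '''Build a dict with the four event fields described above.

    Hypothetical helper for illustration; not part of Zarkov.
    '''
    if not isinstance(type, str):
        raise TypeError('type must be a string')
    return {
        # default to "now" in UTC if the caller gives no timestamp
        'timestamp': timestamp or datetime.now(timezone.utc),
        'type': type,
        # copy the context so later changes by the caller don't leak in
        'context': dict(context or {}),
        'extra': extra,
    }


# Made-up sample values, loosely following the SourceForge description
event = make_event(
    'login',
    context={'project': 'example-project', 'user': 'alice', 'ip': '192.0.2.1'},
    extra={'method': 'password'})
```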

Zarkov stores its events in a MongoDB database (again with the Flash Gordon references). Assuming you've already installed Zarkov, you can start the server with the following command:

(zarkov) $ zcmd -y development.yaml serve

Now to test, you can use the file zsend.py (included with Zarkov) to send a message to the server:

(zarkov) $ echo '{"type":"nop"}' | zsend.py tcp://localhost:6543

To confirm it got there, you can use the 'shell' subcommand from zcmd:

(zarkov) $ zcmd -y development.yaml shell

Then, in the shell you're given, execute the following commands:

In [1]: ZM.event.m.find().all()
Out[1]:
[{'_id': ObjectId('4e2723eeb240217416000001'),
  'aggregates': [],
  'context': {},
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 52, 30, 272000),
  'type': u'nop'}]

(Your _id value will probably be different.) To use Zarkov as an event logger in practice, you'll probably want to send the ZeroMQ messages yourself. Zarkov includes a client to do just that. From the zcmd shell:

In [1]: from zarkov import client
In [2]: conn = client.ZarkovClient('tcp://localhost:6543')
In [3]: conn.event('nop', {'sample_context_key': 'sample_context_val'})
In [4]: ZM.event.m.find().all()
Out[4]: 
[{'_id': ObjectId('4e2723eeb240217416000001'),
  'aggregates': [],
  'context': {},
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 52, 30, 272000),
  'type': u'nop'},
 {'_id': ObjectId('4e2725a8b240217483000001'),
  'aggregates': [],
  'context': {u'sample_context_key': u'sample_context_val'},
  'extra': None,
  'jobs': [],
  'timestamp': datetime.datetime(2011, 7, 20, 18, 59, 52, 756000),
  'type': u'nop'}]

If you want to customize things further, the ZarkovClient code is actually quite short:

'''Python client for zarkov.'''
import zmq

import bson

class ZarkovClient(object):

    def __init__(self, addr):
        # PUSH sockets are one-way: messages go out and no reply comes back
        context = zmq.Context.instance()
        self._sock = context.socket(zmq.PUSH)
        self._sock.connect(addr)

    def event(self, type, context, extra=None):
        '''Send one event as a single BSON-encoded message.'''
        obj = dict(
            type=type, context=context, extra=extra)
        self._sock.send(bson.BSON.encode(obj))

    def event_noval(self, type, context, extra=None):
        '''Build the event with the Zarkov model on the client side and
        send it tagged with the 'event_noval' command.'''
        from zarkov import model
        obj = model.event.make(dict(
                type=type,
                context=context,
                extra=extra))
        obj['$command'] = 'event_noval'
        self._sock.send(bson.BSON.encode(obj))

    def _command(self, cmd, **kw):
        '''Send the keyword arguments as a document tagged with $command.'''
        d = dict(kw)
        d['$command'] = cmd
        self._sock.send(bson.BSON.encode(d))
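
As one example of the kind of customization this makes easy, here's a sketch of a wrapper that merges a fixed set of context keys into every event. Neither DefaultContextClient nor RecordingClient is part of Zarkov; both are invented for illustration, and the recorder simply stands in for a real ZarkovClient so the example runs without a server.

```python
class DefaultContextClient(object):
    '''Wrap a client and merge a fixed context into every event.

    Hypothetical wrapper, not part of Zarkov. `client` can be anything with
    an event(type, context, extra=None) method, such as ZarkovClient.
    '''

    def __init__(self, client, **defaults):
        self._client = client
        self._defaults = defaults

    def event(self, type, context=None, extra=None):
        merged = dict(self._defaults)
        merged.update(context or {})  # per-event keys win over the defaults
        self._client.event(type, merged, extra)


class RecordingClient(object):
    '''Stand-in for ZarkovClient that records events instead of sending them.'''

    def __init__(self):
        self.sent = []

    def event(self, type, context, extra=None):
        self.sent.append((type, context, extra))


recorder = RecordingClient()
conn = DefaultContextClient(recorder, host='web1')
conn.event('nop', {'sample_context_key': 'sample_context_val'})
```

To send real events, you would pass client.ZarkovClient('tcp://localhost:6543') in place of the recorder.
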
Posted by Rick Copeland ☕ 2011-07-20 Labels: zarkov mongodb ming zeromq gevent
