writing and testing custom ansible libraries

This post is about testing custom ansible library code; specifically, the tests I wrote while working on ansible-rails, a wrapper utility for bundler and rake that is useful when deploying rails applications with ansible.

Sadly the documentation on how to test custom library code is really thin. Hopefully this will change, but for now here’s how I write and test custom libraries, using a very simplistic example.

Most ansible libraries will look somewhat like this:

# inside library/magic

#!/usr/bin/python
# -*- coding: utf-8 -*-

DOCUMENTATION = '''some documentation'''
EXAMPLES = '''some usage examples'''

import os
# more imports

def do_magic(module):
  return { 'changed': True, 'option_was': module.params.get('an_option', None) }

def main():
  module = AnsibleModule(
    argument_spec = dict(
      an_option     = dict(required=False, type='str'),
    )
  )
  result = do_magic(module)
  module.exit_json(**result)

# include magic from lib/ansible/module_common.py
#<<INCLUDE_ANSIBLE_MODULE_COMMON>>
main()

To make the do_magic method easily testable I like to move it into a class which accepts the ansible module in its constructor. That way I can easily inject a fake ansible module during tests:

class MagicModule(object):
  module = None
  def __init__(self, module):
    self.module = module

  # more methods here
  def do_magic(self):
    return { 'changed': True, 'option_was': self.module.params.get('an_option', None) }

def main():
  module = AnsibleModule(
    argument_spec = dict(
      an_option     = dict(required=False, type='str'),
    )
  )
  magic_module = MagicModule(module)
  result = magic_module.do_magic()
  module.exit_json(**result)

Next, we need to stop the module from executing every time the file is loaded. We do that by guarding the call to main with a conditional:

# include magic from lib/ansible/module_common.py
#<<INCLUDE_ANSIBLE_MODULE_COMMON>>
if __name__ == '__main__':
    main()

Now we can start writing a simple unit test. There’s only one problem I’ve stumbled over: ansible seems to infer the library name from the filename.

This is a problem because inside your playbook you’d rather write magic: an_option="what to ask?" than magic.py: an_option="what to ask?".
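For illustration, such a task would look like this inside a playbook (a minimal sketch reusing the example option from above):

- name: do some magic
  magic: an_option="what to ask?"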

Removing the file extension, however, breaks Python’s import mechanism, so we need to work around this using imp:

# inside test/magic_test.py
# -*- coding: utf-8 -*-
import os
import unittest
import imp

imp.load_source('magic', os.path.join(os.path.dirname(__file__), os.path.pardir, 'library', 'magic'))
from magic import MagicModule

class FakeAnsibleModule(object):
  check_mode = False

  def __init__(self):
    # use an instance attribute so params don't leak between test cases
    self.params = {}

class TestBase(unittest.TestCase):
  def test_do_magic(self):
    fake_ansible = FakeAnsibleModule()
    fake_ansible.params['an_option'] = 'Hello World'
    magic_module = MagicModule(fake_ansible)

    result = magic_module.do_magic()
    assert result['changed'] == True
    assert result['option_was'] == 'Hello World'

if __name__ == '__main__':
  unittest.main()

Running the test also requires an adjusted PYTHONPATH:

PYTHONPATH=$PWD/library python test/magic_test.py

While this is not perfect, I’d rather test my custom library code than leave it untested.

That’s about all. To wrap up:

  • wrap your logic inside a class, accepting the AnsibleModule as a constructor argument for easy dependency injection.
  • load the module using imp.
  • test the logic and the commands sent using mock or similar (see the sketch below).
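As an example of that last bullet, here’s a minimal sketch of asserting which command a module sends. It assumes a hypothetical do_magic implementation that shells out via the ansible module’s run_command method - both the command and the run_command call are assumptions for illustration:

import mock  # bundled as unittest.mock since Python 3.3

class TestCommands(unittest.TestCase):
  def test_sends_expected_command(self):
    fake_ansible = FakeAnsibleModule()
    # run_command returns a (rc, stdout, stderr) tuple
    fake_ansible.run_command = mock.Mock(return_value=(0, '', ''))

    magic_module = MagicModule(fake_ansible)
    magic_module.do_magic()  # assumed to call self.module.run_command(...)

    # verify the exact command that was sent (hypothetical command)
    fake_ansible.run_command.assert_called_with('echo magic')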


scraping nsscreencast

Today I decided to catch up with NSScreencast. NSScreencast currently has over 100 episodes, and I stopped somewhere around episode 40.

I like to watch them offline when I have time, but downloading the missing ~60 episodes by hand would take too long. Having recently installed GNU Parallel I decided to give it a spin.

My goal was to generate PDFs of all episodes as well as to download the mp4 files of every episode.

This blog post is a short documentation of how one might scrape NSScreencast without fancy tools. If you are a NSScreencast subscriber and like to watch episodes offline, this might help you too.

Generate PDFs of every episode (timing parallel)

Note: the episode descriptions can be accessed without authentication.
I wanted to use wkhtmltopdf to generate the PDFs, so I first checked whether it worked as desired:

wkhtmltopdf http://www.nsscreencast.com/episodes/1 1.pdf

The PDF was generated without problems and looked good.

First I used a regular bash loop to time the entire process:

time for i in {1..102}; do wkhtmltopdf http://www.nsscreencast.com/episodes/$i $i.pdf; done
real  7m47.588s
user  1m46.571s
sys  0m17.431s

This generated all 102 PDFs in the current working directory. But it was quite slow.

For comparison, here’s the same PDF generation with parallel (YMMV):

time parallel wkhtmltopdf http://www.nsscreencast.com/episodes/{} {}.pdf ::: {1..102}
real  1m7.893s
user  1m58.293s
sys  0m19.502s

That’s incredibly fast - and it was enough for me to give up on regular bash loops for this use case, because things would only get worse when downloading huge movie files.
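By default parallel runs one job per CPU core. If you want to tune the concurrency - for example to be gentler on the server - the -j flag controls the number of simultaneous jobs:

# run at most 4 jobs at a time
parallel -j4 wkhtmltopdf http://www.nsscreencast.com/episodes/{} {}.pdf ::: {1..102}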

Authenticate with NSScreencast

Note: the episodes themselves are behind a paywall.
Since I’m a paying subscriber I needed to log in first to retrieve a valid session. Looking at Chrome’s developer tools I copied the sign-in request as cURL to extract the Set-Cookie header:

EMAIL=example@me.com
PASSWORD=awesomeSecretPassword

COOKIE=$(curl -D header -d "email=$EMAIL&password=$PASSWORD" \
  'https://www.nsscreencast.com/user_sessions' > /dev/null && \
  cat header | grep Set-Cookie | sed 's/Set-Cookie: \(.*\); d.*/\1/' && \
  rm header)

I had to use sed to clean the Set-Cookie directive down to the key=value pair, so I could later feed it into cURL again.

Scraping NSScreencast

Having a valid session cookie stored in $COOKIE I again looked at Chrome to get the URL of an episode. As it turned out I could reuse the entire URL and just append “.mp4” to it, which redirects to the correct episode’s video.

Running cURL once validated this:

curl -O -LJ -b "$COOKIE" http://www.nsscreencast.com/episodes/1.mp4

Seconds later the first episode finished downloading. Sweet.

Using parallel I sped up the whole process and downloaded all of them:

parallel curl -O -LJ -b "$COOKIE" http://www.nsscreencast.com/episodes/{}.mp4 ::: {1..102}

This saved me tons of time and as NSScreencast progresses I can continue to watch the videos offline.

Takeaway: GNU parallel is easy to use and speeds things up considerably!


deploying rails applications with ansible

Fellow web developers might know ansible - a lightweight, extensible solution for automating your application provisioning.

Since ansible galaxy opened its doors, sharing roles has become very easy, which is why I started sharing the deployment and rollback roles I’m using for some Ruby on Rails projects.

First, some thoughts on why I switched deployments from mina to ansible:

development of mina stagnated over the past year

Mina still lacks some important features like rollbacks, while other features like multi-host deployments are only possible through hacks. Since most pull requests aren’t accepted by the maintainers of mina, this situation hasn’t improved at all.

Most of the hacks which are necessary when working with mina are not necessary when using ansible.

Granted, ansible becomes much harder to debug as soon as you start extending it with custom library modules, but thanks to the huge number of modules already available, most users probably won’t reach that point until they’ve spent some time with ansible. To me this is a much better situation to work with.

deploying Ruby on Rails applications with ansible

Looking at capistrano and mina, most deployments consist of only three steps:

  • checking out a new version of your application
  • applying changes to your environment (database migrations, assets, …)
  • restarting the application server to pick up code changes

Depending on your application, every step can involve complex operations, like zero-downtime deployments with dropped database columns in between. These more complex problems require special handling, and you need to take care of them yourself.

ansible-rails-deployment executes the first two steps and also handles some additional setup which might be required:

  1. it takes care of a proper $GEM_HOME and $PATH setup. This is important because all necessary binaries (bundle, gem, …) need to be locatable via $PATH; otherwise ansible won’t find them.

  2. then it makes sure that all necessary folders exist, like {{ deploy_to }}, {{ deploy_to }}/shared, {{ deploy_to }}/releases, and all other folders you might need. This can be adjusted using the directories variable.

  3. it ensures all configuration files exist and are up to date. This step uses ansible’s template module and can be configured using the templates variable.

  4. now that the prerequisites are met, a bare copy of your git repository is created or updated, and then cloned into a separate build directory.

  5. a production bundle is created to make sure that all production dependencies are installed properly.

  6. the database is migrated and the assets are compiled. Both steps are configurable using the migrate and compile_assets variables, and default to true.

  7. if all prior steps succeeded, the deployment is considered a success, and the current symlink is updated to point at the new release.

  8. lastly, old versions of your code are removed; only the 5 most recent releases are kept.

The application restart needs to be handled in a separate role, which you write yourself, because ansible does not support dynamic (notify-)handler invocation.
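Such a restart role can be tiny. Here’s a minimal sketch, assuming the application runs under supervisor and the role receives the process name in a service variable (both are assumptions; adapt it to whatever supervises your app):

# roles/restart/tasks/main.yml - a sketch, assuming supervisor
---
- name: restart application
  command: supervisorctl restart {{ service }}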

rolling back Ruby on Rails deployments with ansible

ansible-deployment-rollback only needs to change the current symlink back to the prior release and restart the application, assuming all other side effects are taken care of by roles you write yourself.

Thus, a typical, minimal deployment playbook, which supports rollbacks as well, looks like this:

---
- hosts: server
  user: app-user
  gather_facts: False
  vars:
    user: app-user
    home_directory: "/home/{{ user }}"
    rails_env: "staging"

  roles:
    -
      role: nicolai86.ansible-rails-deployment

      repo: git@github.com:awesome-app
      branch: master

      deploy_to: "{{ home_directory }}/app"

      symlinks:
        -
          src: "{{ shared_path }}/log"
          dest: "{{ build_path }}/log"
        -
          src: "{{ shared_path }}/config/database.yml"
          dest: "{{ build_path }}/config/database.yml"
        -
          src: "{{ shared_path }}/vendor/bundle"
          dest: "{{ build_path }}/vendor/bundle"

      directories:
        - "{{ shared_path }}/log"
        - "{{ shared_path }}/config"
        - "{{ shared_path }}/vendor"
        - "{{ shared_path }}/vendor/bundle"

      templates:
        -
          src: "templates/config/database.js"
          dest: "{{ shared_path }}/config/database.yml"

      tags: deploy

    -
      role: nicolai86.ansible-deployment-rollback

      deploy_to: "{{ home_directory }}/app"

      tags: rollback

    -
      role: restart

      service: my-app:*

      tags:
        - deploy
        - rollback

This takes care of simple deployments for you, and also integrates nicely with your existing ansible provisioning.

Using tags you can either deploy or roll back - just make sure to specify the correct tag.
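For example, assuming the playbook above is saved as deploy.yml:

# deploy a new version
ansible-playbook deploy.yml --tags deploy

# roll back to the previous version
ansible-playbook deploy.yml --tags rollback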

When deploying, as a small bonus, migration and asset compilation only take place if there are differences between the current and the next version (this is checked using the diff command, just like mina does).

If deployments require special handling (like zero-downtime deployments with load balancers in between) you can easily fit that into your restart role, or add new roles prior or posterior to the deployment.

The same goes for rollbacks: run your down-migrations in a separate role prior to restarting, and everything is fine.

This leaves us with a simple and extensible group of roles which can ease deployments and rollbacks - no matter how complicated things get.

Update 2014/02/12: take a look at this gist for a current usage example.


new opportunities in life

As of today I’m no longer an employee at weluse GmbH. It was a fun time but, like all fun times, it’s time for a change.

After deciding to apply for a position as a software developer at mindmatters, I’m really happy to announce that I’ll be joining them starting December 2013. That’s like next week ;)

Aside from being a little nervous because of, well, change, I’m also extremely happy that I’ll be joining a team of highly skilled people, working on interesting and demanding projects.

I’m thrilled :)