Designing Ansible for flexibility and simplicity

When I was first getting started with Ansible, I struggled a lot with the layout for my projects. The Ansible best practices docs were a little bit of help (they’ve gotten way better), but like most “best practices” they solve for a very simple use case.

Google generally wasn’t much help either. Most of what I found solved for “I am running a two tier application on a couple of machines”.

It sounds like it would be a simple problem to solve, but once you think through all the requirements, there is a lot to consider.

I need a layout that is flexible with some areas of enforced consistency, covering multiple app stacks in different clouds, accounts, and environments. I also need it to be as simple and easy-to-understand as possible, so I can hand it off to others. Finally, I need something that can be self-contained for easy portability and testing.

So I came up with my own layout, and so far, it seems to work pretty well.

└── ansible
    ├── ansible.cfg
    ├── callback_plugins
    ├── library
    ├── environments
    │   ├── biz_unit_1
    │   │   ├── dev
    │   │   │   ├── group_vars
    │   │   │   │   ├── dev.yml
    │   │   │   │   └── secret.yml
    │   │   │   └── host_vars
    │   │   │       ├── api_server.yml
    │   │   │       ├── db.yml
    │   │   │       ├── router.yml
    │   │   │       └── webserver.yml
    │   │   ├── prod
    │   │   ├── qa
    │   │   ├── shared
    │   │   ├── api_server.yml
    │   │   ├── db.yml
    │   │   ├── router.yml
    │   │   └── webserver.yml
    │   ├── biz_unit_2
    │   └── biz_unit_3
    ├── global_vars
    ├── inv
    │   ├── ec2.ini
    │   ├──
    │   └── hosts
    └── roles
        ├── apache2
        ├── general_server_config
        ├── java_jdk
        ├── mysql
        ├── nginx
        ├── postfix
        └── varnish

* A very pared-down and sanitized version of what I work in day to day.

Let’s break this down a little to explain the rationale.

The root directory (I’d recommend tossing it in /opt/ansible) and its contents are all stored in source control, including ansible.cfg, a callback_plugins directory, and a library directory. This means ALL Ansible config and customizations can be version controlled and made portable. The library directory may, in turn, reference git submodules for different modules, but it can be included in the main repo either way.

I’ve seen a lot of guides that say to put everything in /etc/ansible, alongside the system-wide Ansible config. This ties your setup to a single machine and makes it fragile and difficult to port/share. I’d recommend against it.

Environments contains folders for each business unit or application family (whichever is appropriate). The biz unit folders have sub-folders for their environments (dev, qa, prod). These folders are where the Ansible group_vars and host_vars live and allow for consistent references from roles and the CLI in the form of /environments/{{ bu_name }}/{{ env }}/host_vars/{{ app_role }}.yml
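
As a rough illustration, the two variable files for a dev webserver might look like the sketch below. Only the file paths come from the layout above; every variable name and value is made up for the example.

```yaml
---
# environments/biz_unit_1/dev/group_vars/dev.yml
# Settings shared by every host in biz_unit_1's dev environment.
# (All keys and values here are hypothetical.)
env_name: dev
aws_region: us-east-1
instance_type: t3.small

---
# environments/biz_unit_1/dev/host_vars/webserver.yml
# Settings that only apply to the webserver stack in dev.
# (All keys and values here are hypothetical.)
http_port: 8080
max_workers: 4
```

Swapping dev for qa or prod in the path gives you the same set of keys with different values, which is the whole point of keeping the structure consistent.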

Secret.yml files can be added at whatever level makes sense to hold the Ansible Vault-encrypted variables for the environment or host.
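
For illustration, here is what a secret.yml might contain before it is encrypted. The variable names and values are placeholders I made up, not anything from my real environments.

```yaml
---
# environments/biz_unit_1/dev/group_vars/secret.yml
# Plaintext view, before running `ansible-vault encrypt` on the file.
# (Both keys and values are hypothetical placeholders.)
vault_db_password: "not-a-real-password"
vault_api_token: "not-a-real-token"
```

Once the file is encrypted with ansible-vault encrypt, it can be loaded like any other vars file as long as the play is run with a vault password (for example via --ask-vault-pass or --vault-password-file).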

The root of each biz unit’s folder contains the playbooks for each host/stack. These are not variables, but the references to roles which in turn consume the environmental variables.

Example: /environments/biz_unit_1/webserver.yml would be something like:

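- hosts: tag_Role_webserver
  # Generic roles, applied in order
  roles:
    - general_server_config
    - java_jdk
    - tomcat
  # Environment-specific variables; env is passed in at run time
  vars_files:
    - '{{ env }}/host_vars/webserver.yml'
    - '{{ env }}/group_vars/{{ env }}.yml'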

Global_vars contains what it says on the tin. These are variables that apply to all hosts and/or environments.
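
One way to pull the global variables into a play is to list them first in vars_files, so that the environment files loaded after them can override individual values. This is a sketch of that idea rather than my exact setup; the file name all.yml is an assumption.

```yaml
# Hypothetical variant of environments/biz_unit_1/webserver.yml
- hosts: tag_Role_webserver
  vars_files:
    # Relative paths resolve from the playbook's own directory.
    # all.yml is an assumed file name inside global_vars.
    - ../../global_vars/all.yml
    # Later files win, so environment values override the globals.
    - '{{ env }}/group_vars/{{ env }}.yml'
    - '{{ env }}/host_vars/webserver.yml'
  roles:
    - general_server_config
```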

Inv – Dynamic inventory scripts for cloud providers live here, as does a static inventory file with a couple of localhost-specific settings in it (mostly done as a workaround to some buggy Ansible variable parsing).

You will find docs that suggest that you put host vars into an inventory file. This doesn’t make sense unless you are dealing with a static, traditional environment, and even then it can get ugly.

Doing as much as you can with the dynamic inventories will minimize the number of vars you need to keep track of and prevent situations where you are working against old data (that’s the dynamic part).
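
The layout above uses the script-style EC2 inventory with its ec2.ini config. Newer Ansible releases offer an inventory plugin in the amazon.aws collection that does the same job from a YAML file. The sketch below is that swapped-in approach, not what my repo contains; the region is an assumption.

```yaml
# inv/aws_ec2.yml
# The plugin only picks up files whose names end in aws_ec2.yml or aws_ec2.yaml.
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1              # assumption: use your own regions
filters:
  # Skip stopped or terminated instances
  instance-state-name: running
keyed_groups:
  # Creates one group per tag pair, named tag_<Key>_<Value>,
  # e.g. an instance tagged Role=webserver lands in tag_Role_webserver
  - prefix: tag
    key: tags
```

With groups generated from tags like this, a play can target tag_Role_webserver without any host ever being written down by hand.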

Roles is, of course, where roles live. I generally use ansible-galaxy init rolename to create roles since it builds out the directory scaffolding and boilerplate files. This does create some extra cruft you might not want (similar to running rails generate), but I like the consistency it provides.

Outside of defaults, no variables should exist in roles. These are generic plays that should be able to run against any of your environments by referencing variables in the environmental vars. Putting environment-specific variables in your roles is a good way to make things break and creates a mess if you’d like to open source your role later.
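
To make that concrete, here is a minimal sketch of a role that keeps only neutral defaults and leaves everything environment-specific to the vars files. The variable names, the template file nginx.conf.j2, and the values are all hypothetical.

```yaml
---
# roles/nginx/defaults/main.yml
# Environment-neutral fallbacks only. Anything that differs between
# dev, qa, and prod belongs in the environment vars, which override these.
nginx_listen_port: 80
nginx_worker_processes: 2

---
# roles/nginx/tasks/main.yml
- name: Install nginx
  ansible.builtin.package:
    name: nginx
    state: present

- name: Render the nginx config from a template
  ansible.builtin.template:
    # nginx.conf.j2 is an assumed template that reads the two variables above
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
```

Nothing in the role mentions an environment or a business unit, so the same role can run anywhere and could be published as-is.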

Triggering plays

Setting up Ansible in this way makes it easy to share the same Ansible repo across an entire enterprise and trigger parameterized plays by passing different values to Jenkins (or Ansible Tower, or Rundeck, or whatever you use to schedule/execute your Ansible plays).

For example, you could set up an nginx deployment in any of your environments by adding a vars file with any environment-specific customization and feeding a Jenkins job the correct parameters for the destination environment.

It might look like this:

  1. User (could basically be a trained monkey) triggers the Jenkins job for “Build Infra” with params like “Sales”, “Prod”, “Router”. These params could be hard-coded if you wanted to make a true “one-click” deployment/re-deployment. Going further, you could also auto-trigger this job with a git webhook. (Sorry, monkey, no job for you.)
  2. Jenkins checks out the latest git commit for the Ansible repo and executes a play using the parameters that were fed to it.
  3. The Ansible play calls AWS APIs and SSH to spin up an instance, install and configure nginx, save the instance as an AMI, create a load balancer, create an autoscaling group with the customized AMI, and update DNS to direct traffic to the new load balancer. (If the stack already exists, it would be updated with the new settings.)

This is for an immutable workflow. A more traditional config-management scenario would be even simpler.
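
One way to wire the job parameters to the layout is a thin entry-point playbook that Jenkins always calls. This is a sketch under assumptions: the file name site.yml is made up, and the variable names bu_name, env, and app_role are borrowed from the path pattern described earlier.

```yaml
# site.yml at the repo root (hypothetical file name)
# import_playbook is resolved when the playbook is parsed, so the three
# variables need to be available at that point, e.g. as extra vars (-e).
- import_playbook: "environments/{{ bu_name }}/{{ app_role }}.yml"
```

The Jenkins job would then run something like ansible-playbook site.yml -e "bu_name=biz_unit_1 env=prod app_role=router", with the inventory set in ansible.cfg or passed with -i.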

Other considerations

Another design I considered was using git submodules to slice up the different environments. Ultimately, this felt a little clunky and added unnecessary complexity for my use cases. If the teams using Ansible can’t play well together or you don’t want all your vars and roles in one repo, this path might make sense.

Related to the above, this layout does require testing discipline for the core roles that are shared across environments. Borking up a shared “launch_EC2” role and checking it into master could annoy your teammates when they’re trying to figure out why their plays are failing.
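
A cheap first line of defense is a small test playbook per role that can be run before anything gets merged. The sketch below assumes roles_path in ansible.cfg points at the roles directory so the role name resolves; the overridden variable is hypothetical.

```yaml
# roles/nginx/tests/test.yml
- hosts: localhost
  connection: local
  vars:
    # Hypothetical override, to prove the role honors environment values
    nginx_listen_port: 8080
  roles:
    - nginx
```

Running it with --syntax-check catches broken YAML and missing roles, and --check does a dry run without changing the machine. Neither replaces a real test environment, but both are fast enough to run on every commit.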

I continue to look for ways to make the layout simpler, with fewer files and directories. At the moment, this is the simplest setup I could figure out that met all requirements. I’m definitely open to feedback though.