How We Test: Per Bug Environments

Wed 03 February 2016

When I started, we had three environments: dev, test, and production. Having everyone push to dev meant incomplete and buggy code got migrated along with working code when branches were merged to test and eventually production. To alleviate this, and to let QA pass or fail bugs individually, we replaced those three environments with many environments.

We still have production and test. But now we also have a separate test environment for each bug or feature being worked on. At any point in the development process, we can create one of these "Per Bug Environments" (or PBEs) and let QA or product owners see our progress and raise questions.

What's in a PBE?

We've wrapped PBE jobs around two apps, one backed by Postgres and one by MySQL. One of the apps uses the other as a back end, so in our case we build both for each bug.

What we use:

  • Jenkins
  • GitLab
  • wildcard DNS
  • sanitized DB dumps
  • Apache VirtualHosts
  • Several gnarly bash scripts
  • Redmine
  • wildcard SSL certificate

How we work

Developers push their working branch to their GitLab fork. In GitLab, they open a merge request against the master repo. If all goes well, QA finds links to the PBE pair of sites in the related bug in Redmine. When QA passes a bug, it's merged to test, retested, and merged to production.
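Concretely, the developer side of that loop looks something like this. The branch and remote names are illustrative, and throwaway local repos stand in for the GitLab fork:

```shell
#!/usr/bin/env bash
# Sketch of the developer-side flow; "origin" and "bug1234" are placeholders.
set -euo pipefail

fork="$(mktemp -d)"             # stand-in for the developer's GitLab fork
git init --bare -q "$fork"

work="$(mktemp -d)"             # stand-in for the developer's working copy
cd "$work"
git init -q
git config user.email dev@example.com
git config user.name Dev
git remote add origin "$fork"

git checkout -q -b bug1234      # working branch named after the Redmine bug
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix bug 1234"
git push -q origin bug1234      # push the working branch to the fork
# From here, the merge request is opened in the GitLab UI.
```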

Triggering a PBE job

The backbone of this system is a Jenkins job that wraps all the scripts necessary to build a running copy of the app. Whenever there is a new or updated merge request in the monitored repo, the Gitlab Merge Request Builder Plugin triggers this job.

Orchestrating the app

We use the branch name as part of the host name and configuration of each PBE, so branch bug1234 becomes https://bug1234.testdomain.net/, which is backed by a database named bug1234 and lives in /var/www/pbe/bug1234. On the PBE host, the job creates the databases and populates the directory trees. If there are no errors, a Python script regexes the bug number out of the branch name and posts a message on the referenced Redmine bug.
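A minimal sketch of how one branch name fans out into a PBE's URL, database, and directory. The real scripts are gnarlier and also load the sanitized dumps; variable names here are hypothetical:

```shell
#!/usr/bin/env bash
# Hedged sketch: derive the PBE pieces from a merge-request branch name.
set -euo pipefail

branch="${1:-bug1234}"                  # branch name from the merge request
pbe_dir="/var/www/pbe/${branch}"        # per-branch document tree
db_name="${branch}"                     # per-branch database
url="https://${branch}.testdomain.net/" # wildcard DNS + SSL make this resolve

# Pull the bug number out of the branch name for the Redmine comment
# (the real version is a small Python script).
bug_id="$(grep -oE '[0-9]+' <<<"$branch")"

echo "dir=${pbe_dir} db=${db_name} url=${url} bug=${bug_id}"
```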

DNS, SSL, and Apache

We already had a test domain and a wildcard SSL certificate, so we reused those. DNS is pretty straightforward: A or CNAME records point at IPs as usual for existing subdomains, plus a single * (wildcard) record listing the IP of the PBE host. Our production site is 100% SSL all the time, so our PBEs are the same.
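In zone-file terms, the wildcard looks something like this. The names and documentation-range IPs are placeholders, not our real records:

```
; existing subdomains keep their usual records
www    IN  A  203.0.113.5
; one wildcard record catches every PBE hostname
*      IN  A  203.0.113.10
```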

To keep from having to create and remove a unique bug1234_vhost.conf file for each PBE, I used Apache's mass virtual hosting feature (mod_vhost_alias). The config file looks something like this:

<VirtualHost *:443>
  ServerAlias *.testdomain.net
  VirtualDocumentRoot /var/www/pbe/%1/www
</VirtualHost>

The %1 expands to the first dotted part of the requested hostname, so a request for bug1234.testdomain.net is served from /var/www/pbe/bug1234/www with no per-bug configuration.

Odds & Ends

I hope to eventually remove the test phase and merge passed bugs straight into production.

The PBE process as we've built it isn't perfect; for one thing, it eats a lot of disk space. There is a "garbage collection" job that cleans up old PBEs, but currently it's run by hand when the disks fill up.
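A minimal sketch of what that garbage collection pass might look like, assuming PBEs live under /var/www/pbe and 30 days counts as "old" (both placeholders; the real job also has to drop the matching databases):

```shell
#!/usr/bin/env bash
# Hedged sketch of the cleanup pass; swap the echo below for "rm -rf" (plus a
# DROP DATABASE per branch) to actually reclaim the space.
set -euo pipefail

# Print top-level PBE directories under $1 untouched for more than $2 days.
stale_pbes() {
  local root="$1" days="$2"
  find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days"
}

# Dry run:
# stale_pbes /var/www/pbe 30 | while read -r d; do echo "would remove: $d"; done
```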

We've talked about spinning up a tiny AWS instance for each PBE and building its environment there instead of creating them all on one large PBE host. This would replace one type of garbage collection with another: dead instances. Each new instance would have its own IP, which could give us an excuse to play with Amazon's Route 53 DNS.

The scripts right now are bash-flavored duct tape. I think production, the PBEs, and our local Vagrant instances would all be much improved by converting them to Ansible playbooks or Chef recipes.