Docker for Mac file system performance summary

One of the more controversial topics right now in the Docker community is the issue surrounding file system performance in the Docker for Mac application.

For a very long time users have been forced to use workarounds to speed up performance when dealing with slow read and write times.  For example, this thread has been open on the Docker forums for over a year now, describing the problem and various workarounds users have found during that time.  There have been blog posts describing various optimizations, as well as scripts and tools to alleviate some of the frustration around slow file system performance on Docker for Mac.

There is a great explanation from the Docker team that lays out the details of the file system performance issues and what the crux of the problem is right now.

At the highest level, there are two dimensions to file system performance: throughput (read/write IO) and latency (roundtrip time). In a traditional file system on a modern SSD, applications can generally expect throughput of a few GB/s. With large sequential IO operations, osxfs can achieve throughput of around 250 MB/s which, while not native speed, will not be the bottleneck for most applications which perform acceptably on HDDs.

The article later goes on to highlight the plan to improve performance along with a number of specific items for accomplishing this.

Under development, we have:

  1. A Linux kernel patch to reduce data path latency by 2/7 copies and 2/5 context switches
  2. Increased OS X integration to reduce the latency between the hypervisor and the file system server
  3. A server-side directory read cache to speed up traversal of large directories
  4. User-facing file system tracing capabilities so that you can send us recordings of slow workloads for analysis
  5. A growing performance test suite of real world use cases (more on this below in What you can do)
  6. Experimental support for using Linux’s inode, writeback, and page caches
  7. End-user controls to configure the coherence of subsets of cross-OS bind mounts without exposing all of the underlying complexity

Additionally, with the latest release of the Docker for Mac 17.04-ce-mac7 (April 6 2017) client, a new :cached flag has been introduced for volume mounts to help with read times for lots of files.  There is also work going on to introduce another :delegated flag to help speed up write times.

Initial user testing of the :cached flag has been good, and shown up to a 4x improvement in some cases.  You can follow this issue on Github to get the most up to date information.  There is some really good detail and discussion going on over there (towards the bottom of the issue is where the new flags are discussed).

Overall I think Docker has done a great job of keeping users informed and updated on the various aspects of the problem and has been steadily making progress in addressing the situation.  The container ecosystem is still very young so there will be growing pains along the way and I think the way that Docker has been handling things has been more than reasonable as they have consistently been making progress on addressing the issue and have been transparent in recent months about what’s going on and how they’re working on the problem.

Read More

awesome-rancher

Awesome lists are a newish phenomenon for organizing links and information that have been gaining popularity on Github.  You can read more about awesome lists here, but essentially they are curated lists of helpful links and other various resources that are put together by members of their communities.

There are lots of awesome lists these days on Github, and there are even awesome lists for non-tech categories, including recipes and video games. Oddly enough though through my own searching, I quickly discovered that there was no awesome list for Rancher.

So I decided to give a little bit back to the Rancher community in my own way.  If you aren’t aren’t familiar with Rancher, it is essentially a platform and orchestration engine for container based workloads.  Out of all of the container management and orchestration platforms that have been growing out of the container popularity explosion, using Rancher has definitely the most pleasant experience and easiest to use.  Also, if you are new to Rancher, please go check out the list on Github for more information, that is what it is there for.

Check out the awesome-rancher project here.

The goal for me and for the project is to make awesome-rancher the go to resource for finding useful information.  Therefore, I want awesome-rancher to be as easy to use and as complete as possible so new users have a good jumping off place when they decide to dive into Rancher.

If you are a Rancher community member or even just notice a key resource missing from the list, please either let me know with a comment here, on twitter/facebook, or just open a PR on the project itself (which is probably the easiest).

Read More

Configure a Rancher HAProxy health check

If you are familiar with ELB/ALB you will know that there are slight idiosyncrasies between the two.  For example, ELB allows you to health check a back end server by TCP port.  Basically allowing the user to check if a back end comes up and is listening on a specified port.  ALB is slightly different in its method for health checking.  ALB uses HTTP checks (layer 7) to ensure back end instances are up and listening.

This becomes a problem in Rancher, when you have multiple stacks in a single environment that are fronted by the Rancher HAProxy load balancer.  By default, the HAProxy config does not have a health check endpoint configured, so ALB is never able to know if the back end server is actually up and listening for requests.

A colleague and I  recently discovered a neat trick for solving this problem if you are fronting your environment with an ALB.  The solution to this conundrum is to sprinkle a little bit of custom configuration to the Rancher HAProxy config.

In Rancher, you can modify the live settings without downtime.  Click on the load balancer that sits behind the ALB and navigate to the Custom haproxy.cfg tab.

haproxy config

Modify the HAProxy config by adding the following:

# Use to report haproxy's status
defaults
    mode http
    monitor-uri /_ping

Click the “Edit” button to apply these changes and you should be all set.

Next, find the health check configuration for the associated ALB in the AWS console and add a check the the /_ping path on port 80 (or whichever port you are exposing/plan to listen on).  It should look similar to the following example.

Health checks

Below is an example that maps a DNS name to an internal Nginx container that is listening for requests on port 80.

HAPRoxy configuration

The check in ALB ensures that the HAProxy load balancer in Rancher is up and running before allowing traffic to be routed to it.  You can verify that your Rancher load balancer is working if the instances behind your ALB start showing a status of healthy in the AWS console.

NOTE: If you don’t have any apps initially behind the Rancher load balancer (or that are listening on the port specified in the health check) the AWS instances behind ALB will remain unhealthy until you add configuration in Rancher for the stacks to be exposed, as pictured above.

After setting up HAProxy, publicly accessible services in private Rancher environments can easily be managed by updating the HAProxy config.  Just add a dns name and a service to link to and HAProxy is able to figure out how and where to route requests to.  To map other services that aren’t listening on port 80, the process is  very similar.  Use the above as a guideline and simply update the target port to whichever port the app is listening on internally.

Read More

Backup Route 53 zones

We have all heard about DNS catastrophes.  I just read about horror story on reddit the other day, where an Azure root DNS zone was accidentally deleted with no backup.  I experienced a similar disaster a few years ago – a simple DNS change managed to knock out internal DNS for an entire domain, which contained hundreds of records.  Reading the post hit close to home, uncovering some of my own past anxiety, so I began poking around for solutions.  Immediately, I noticed that backing up DNS records is usually skipped over as part of the backup process.  Folks just tend to never do it, for whatever reason.

I did discover, though, that backing up DNS is easy.  So I decided to fix the problem.

I wrote a simple shell script that dumps out all Route53 zones for a given AWS account to a json file, and uploads the zones to an S3 bucket.  The script is a handful lines, which is perfect because it doesn’t take much effort to potentially save your bacon.

If you don’t host DNS in AWS, the script can be modified to work for other DNS providers (assuming they have public API’s).

Here’s the script:

#!/usr/bin/env bash

set -e

# Dump route 53 zones to a text file and upload to S3.

BACKUP_DIR=/home/<user>/dns-backup
BACKUP_BUCKET=<bucket>
# Use full paths for cron
CLIPATH="/usr/local/bin"

# Dump all zones to a file and upload to s3
function backup_all_zones () {
  local zones
  # Enumerate all zones
  zones=$($CLIPATH/aws route53 list-hosted-zones | jq -r '.HostedZones[].Id' | sed "s/\/hostedzone\///")
  for zone in $zones; do
  echo "Backing up zone $zone"
  $CLIPATH/aws route53 list-resource-record-sets --hosted-zone-id $zone > $BACKUP_DIR/$zone.json
  done

  # Upload backups to s3
  $CLIPATH/aws s3 cp $BACKUP_DIR s3://$BACKUP_BUCKET --recursive --sse
}

# Create backup directory if it doesn't exist
mkdir -p $BACKUP_DIR
# Backup up all the things
time backup_all_zones

Be sure to update the <user> and <bucket> in the script to match your own environment settings.  Dumping the DNS records to json is nice because it allows for a more programmatic way of working with the data.  This script can be run manually, but is much more useful if run automatically.  Just add the script to a cronjob and schedule it to dump DNS periodically.

For this script to work, the aws cli and jq need to be installed.  The installation is skipped in this post, but is trivial.  Refer to the links for instructions.

The aws cli needs to be configured to use an API key with read access from Route53 and the ability to write to S3.  Details are skipped for this step as well – be sure to consult the AWS documentation on setting up IAM permissions for help with setting up API keys.  Another, simplified approach is to use a pre-existing key with admin credentials (not recommended).

Read More

Backing up Jenkins configurations to S3

If you have worked with Jenkins for any extended length of time you quickly realize that the Jenkins server configurations can become complicated.  If the server ever breaks and you don’t have a good backup of all the configuration files, it can be extremely painful to recreate all of the jobs that you have configured.  And most recently if you have started using the Jenkins workflow libraries, all of your custom scripts and coding will disappear if you don’t back it up.

Luckily, backing up your Jenkins job configurations is a fairly simple and straight forward process.  Today I will cover one quick and dirty way to backup configs using a Jenkins job.

There are some AWS plugins that will backup your Jenkins configurations but I found that it was just as easy to write a little bit of bash to do the backup, especially since I wanted to backup to S3, which none of the plugins I looked at handle.  In genereal, the plugins I looked at either felt a little bit too heavy for what I was trying to accomplish or didn’t offer the functionality I was looking for.

If you are still interested in using a plugin, here are a few to check out:

Keep reading if none of the above plugins look like a good fit.

The first step is to install the needed dependencies on your Jenkins server.  For the backup method that I will be covering, the only tools that need to be installed are aws cli, tar and rsync.  Tar and rsync should already be installed and to get the aws cli you can download and install it with pip, from the Jenkins server that has the configurations you want to back up.

pip install awscli

After the prerequisites have been installed, you will need to create your Jenkins job.  Click New Item -> Freestyle and input a name for the new job.

jenkins job name

Then you will need to configure the job.

The first step will be figuring out how often you want to run this backup.  A simple strategy would be to backup once a day.  The once per day strategy is illustrated below.

backup periodically

Note the ‘H’ above means to randomize when the job runs over the hour so that if other jobs were configured they would try to space out the load.

The next step is to backup the Jenkins files.  The logic is all written in bash so if you are familiar it should be easy to follow along.

# Delete all files in the workspace
rm -rf *

# Create a directory for the job definitions
mkdir -p $BUILD_ID/jobs

# Copy global configuration files into the workspace
cp $JENKINS_HOME/*.xml $BUILD_ID/

# Copy keys and secrets into the workspace
cp $JENKINS_HOME/identity.key.enc $BUILD_ID/
cp $JENKINS_HOME/secret.key $BUILD_ID/
cp $JENKINS_HOME/secret.key.not-so-secret $BUILD_ID/
cp -r $JENKINS_HOME/secrets $BUILD_ID/

# Copy user configuration files into the workspace
cp -r $JENKINS_HOME/users $BUILD_ID/

# Copy custom Pipeline workflow libraries
cp -r $JENKINS_HOME/workflow-libs $BUILD_ID

# Copy job definitions into the workspace
rsync -am --include='config.xml' --include='*/' --prune-empty-dirs --exclude='*' $JENKINS_HOME/jobs/ $BUILD_ID/jobs/

# Create an archive from all copied files (since the S3 plugin cannot copy folders recursively)
tar czf jenkins-configuration.tar.gz $BUILD_ID/

# Remove the directory so only the tar.gz gets copied to S3
rm -rf $BUILD_ID

Note that I am not backing up the job history because the history isn’t important for my uses.  If the history IS important, make sure to add a line to backup those locations.  Likewise, feel free to modify and/or update anything else in the script if it suits your needs any better.

The last step is to copy the backup to another location.  This is why we installed aws cli earlier.  So here I am just uploading the tar file to an S3 bucket, which is versioned (look up how to configure bucket versioning if you’re not familiar).

export AWS_DEFAULT_REGION="xxx"
export AWS_ACCESS_KEY_ID="xxx"
export AWS_SECRET_ACCESS_KEY="xxx"

# Upload archive to S3
echo "Uploading archive to S3"
aws s3 cp jenkins-configuration.tar.gz s3://<bucket>/jenkins-backup/

# Remove tar.gz after it gets uploaded to S3
rm -rf jenkins-configuration.tar.gz

Replace the AWS_DEFAULT_REGION with the region where the bucket lives (typically us-east-1), make sure to update the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to use an account with access to write to AWS S3 (not covered here).  The final thing to note, <bucket> should be replaced to use your own bucket.

The backup process itself is usually pretty fast unless the Jenkins server has a massive amount of jobs and configurations.  Once you have configured the job, feel free to run it once to test if it works.  If the job worked and returns as completed, go check your S3 bucket and make sure the tar.gz file was uploaded.  If you are using versioning there should just be one file, and if you choose the “show versions” option you will see something similar to the following.

s3 backup

If everything went okay with your backup and upload to s3 you are done.  Common issues configuring this backup method are choosing the correct AWS bucket, region and credentials.  Also, double check where all of your Jenkins configurations live in case there aren’t in a standard location.

Read More