Category Archives: Bash

Change CoreOS default toolbox

This is a little trick that allows you to override the default base OS in the CoreOS “toolbox“.  The toolbox is a neat trick to allow you to debug and troubleshoot issues inside containers on CoreOS without having to do any outside work of setting up a container.

The default toolbox OS defaults to Fedora, which we’re going to change to Ubuntu.  There is a custom configuration file that will get read in via the .toolboxrc file, located at /home/core/.toolboxrc by default.  To keep things simple we will only be changing the few pieces of the config to get the toolbox to behave how we want.  More can be changed but we don’t really need to override anything else.

TOOLBOX_DOCKER_IMAGE=ubuntu
TOOLBOX_DOCKER_TAG=14.04

That’s pretty cool, but what if we want to have this config file be in place for all servers?  We don’t want to have to manually write this config file for every server we log in to.

To fix this issue we will add a simple configuration in to the user-data file that gets fed in to the CoreOS cloud-config when the server is created.  You can find more information about the CoreOS cloud-configs here.

The bit in the cloud config that needs to change is the following.

-write_files:
  - path: /home/core/.toolboxrc
    owner: core
    content: |
      TOOLBOX_DOCKER_IMAGE=ubuntu
      TOOLBOX_DOCKER_TAG=14.04

If you are already using cloud-config then this change should be easy, just add the bit starting with -path to your existing -write_files section.  New servers using this config will have the desired toolbox defaults.

This approach gives us an automated, reproducible way to clone our custom toolbox config to every server that uses cloud-config to bootstrap itself.  Once the config is in place simply run the “toolbox” command and it should use the custom values to pull the desired Ubuntu image.

Then you can run your Ubuntu commands and debugging tools from within the toolbox.  Everything else will be the same, we just use Ubuntu now as our default toolbox OS.  Here is the post that gave me the idea to do this originally.

Create a Kubernetes cluster on AWS and CoreOS with Terraform

Up until my recent discovery of Terraform, the process I had been using to test CoreOS and Kubernetes was somewhat cumbersome and manual.  There are still some manual steps and processes involved in the bootstrap and cluster creation process that need to get sorted out, but now I can bring environments up and down, quickly and automatically.  This is a HUGE time saver and also makes testing easier because these changes can happen in a matter of minutes rather than hours and can all be self documented for others to reference in a Github repo.  Great success.

NOTE:  This method seems to be broken as of the 0.14.2 release of Kubernetes.  The latest version I could get to work reliably was v0.13.1.  I am following the development and looking forward to the v1.0 release but won’t revisit this method until something stable has been shipped because there are still just too many changes going on.  With that said, v0.13.1 has a lot of useful functionality and this method is actually really easy to get working once you have the groundwork laid out.

Another benefit is that as the project develops and matures, the only thing that will need modified are the cloud configs I am using here.  So if you follow along you can use my configs as a template, feel free to use this as a base and modify the configs to get this working with a newer release.  As I said I will be revisiting the configs once things slow down a little and a v1 has been released.

Terraform

So the first component that we need to enable in this workflow is Terraform.  From their site, “Terraform is a tool for building, changing, and combining infrastructure safely and efficiently.”  Basically, Terraform is a command line tool for allowing you to implement your infrastructure as code across a variety of different infrastructure providers.  It should go without saying, being able to test environments across different platforms and cloud providers is a gigantic benefit.  It doesn’t lock you in to any one vendor and greatly helps simplify the process of creating complex infrastructures across different platforms.

Terraform is still a young project but has been maturing nicely and currently supports most of the functionality needed for this method to work (the missing stuff is in the dev pipeline and will be released in the near future).  Another benefit is that Terraform is much easier to use and understand than CloudFormation, which is  a propriety cloud provisioning tool available to AWS customers, which could be used if you are in a strictly AWS environment.

The first step is to download and install Terraform.  In this example I am using OSX but the instructions will be similar on Linux or other platforms.

cd /tmp
wget https://dl.bintray.com/mitchellh/terraform/terraform_0.3.7_darwin_amd64.zip
unzip terraform_0.3.7_darwin_amd64.zip
mv terraform* /usr/local/bin

After you have moved the binary you will need to source your shell.  I use zsh so I just ran “source ~/.zshrc” to update the path for terraform.

To test terraform out you can check the version to make sure it works.

terraform version

Now that Terraform is installed you will need to get some terraform files set up.  I suggest making a local terraform directory on your machine so you can create a repo out of it later if desired.  I like to split “services” up by creating different directories.  So within the terraform directory I have created a coreos directory as well as a kubernetes directory, each with their own variables file (which should be very similar).  I don’t know if this approach is a best practice but has been working well for me so far.  I will gladly update this workflow if there is a better way to do this.

Here is a sample of what the file and directory layout might look like.

cloud-config
  etcd-1.yml
  etcd-2.yml
  etcd-3.yml
  kube-master.yml
  kube-node.yml
etcd
  dns.tf
  etcd.tf
  variables.tf
kubernetes
  dns.tf
  kubernetes.tf
  variables.tf

As you can see there is a directory for Etcd as well as Kubernetes specific configurations.  You may also notice that there is a cloud-config directory.  This will be used as a central place to put configurations for the different services.

Etcd

With Terraform set up, the next component needed for this architecture to work is a functioning etcd cluster. I chose to use a separate 3 node cluster (spread across 3 AZ’s) for improved performance and resliency.  If one of the nodes goes down or away with a 3 node cluster it will still be operational, where if a 1 node cluster goes away you will be in much more trouble.  Additionally if you have other services or servers that need to leverage etcd you can just point them to this etcd cluster.

Luckily, with Terraform it is dead simple to spin up and down new clusters once you have your initial configurations set up and configured correctly.

At the time of this writing I am using the current stable version of CoreOS, which is 633.1.0, which uses version 0.4.8 of etcd.  According to the folks at CoreOS, the cloud configs for old versions of etcd should work once the new version has been released so moving to a the new 2.0 release should be easy once it hits the release channel but some tweaks or additional changes to the cloud configs may need to occur.

Configuration

Before we get in to the details of how all of this works, I would like to point out that many of the settings in these configuration files will be specific to users environments.  For example I am using an AWS VPC in the “us-east-1” region for this set up, so you may need to adjust some of the settings in these files to match your own scenario.  Other custom components may include security groups, subnet id’s, ssh keys, availability zones, etc.

Terraform offers resources for basically all network components on AWS so you could easily extend these configurations to build out your initial network and environment if you were starting a project like this from scratch.  You can check all the Terraform resources for the AWS provider here.

Warning: This guide assumes a few subtle things to work correctly.  The address scheme we are using for this environment is a 192.168.x.x, leveraging 3 subnets to spread the nodes out across for additional availability (b, c, e) in the US-East-1 AWS region.  Anything in the configuration that has been filled in with “XXX” represents a custom value that you will need to either create or obtain in your own environment and modify in the configuration files.

Finally, you will need to provide AWS credentials to allow Terraform to communicate with the API for creating and modifying resources.  You can see where these credentials should be filled in below in the variables.tf file.

variables.tf

variable "access_key" { 
 description = "AWS access key"
 default = "XXX"
}

variable "secret_key" { 
 description = "AWS secret access key"
 default = "XXX"
}

variable "region" {
 default = "us-east-1"
}

/* CoreOS AMI - 633.1.0 */

variable "amis" {
 description = "Base CoreOS AMI"
 default = {
 us-east-1 = "ami-d6033bbe" 
 }
}

Here is what an example CoreOS configs look like.

etcd.tf

provider "aws" {
 access_key = "${var.access_key}"
 secret_key = "${var.secret_key}"
 region = "${var.region}"
}

/* Etcd cluster */

resource "aws_instance" "etcd-01" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1e" 
 instance_type = "t2.micro"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = XXX"
 private_ip = "192.168.1.10"
 user_data = "${file("../cloud-config/etcd-1.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "20"
 } 
}

resource "aws_instance" "etcd-02" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1b" 
 instance_type = "t2.micro"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 private_ip = "192.168.2.10"
 user_data = "${file("../cloud-config/etcd-2.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "20"
 } 
}

resource "aws_instance" "etcd-03" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1c" 
 instance_type = "t2.micro"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 private_ip = "192.168.3.10"
 user_data = "${file("../cloud-config/etcd-3.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "20"
 } 
}

Below I have created a configuration file as a simaple way to create DNS records dynamically when spinning up the etcd cluster nodes.

dns.tf

 resource "aws_route53_record" "etcd-01" {
 zone_id = "XXX"
 name = "etcd-01.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.etcd-01.private_ip}"]
}

resource "aws_route53_record" "etcd-02" {
 zone_id = "XXX"
 name = "etcd-02.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.etcd-02.private_ip}"]
}

resource "aws_route53_record" "etcd-03" {
 zone_id = "XXX"
 name = "etcd-03.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.etcd-03.private_ip}"]
}

Once all of the configurations have been put in place and all look right you can test out what your configuration will look like with the “plan” command:

cd etcd
terraform plan

Make sure to change in to your etcd directory first.  This will examine your current configuration and calculate any changes.  If your environment is completely unconfigured then this command will return some output that explains what terraform is planning to do.

If you don’t want the input prompts when you run your plan command you can append the “-input=false” flag to bypass the configurations.

If everything looks okay with the plan you can tell Terraform to “apply” your conifgs with the following:

terraform apply
OR
terraform apply -input=false

If everything goes accordingly, after a few minutes you should have a new 3 node etcd cluster running on the lastest stable version of CoreOS with DNS records for interacting with the nodes!  To double check that the servers are being created you can check the AWS console to see if your newly defined servers have been created.  The console is a great way to double check that things all work okay and that the right values were created.

If you are having trouble with the cloud configs check the end of the post for the link to all of the etcd and Kubernetes cloud configs.

Kubernetes

The Kubernetes configuration is very similar to etcd.  It uses a variables.tf, kubernetes.tf and dns.tf file to configure the Kubernetes cluster.

The following configurations will build a v0.13.1 Kubernetes cluster with 1 master, and 3 worker nodes to begin with.  This config can be extended easily to scale the number of worker nodes to basically as many as you want (I could easily image the hundreds or thousands), simply by changing a few number in the configuration, barely adding any overhead to our current process and workflow, which is nice.  Because of these possibilities, Terraform allows for a large amount of flexibility in how you manage your infrastructure.

This configuration is using c3.large instances so be aware that your AWS bill may be affected if you spin nodes up and fail to turn them off when you are done testing.

provider "aws" {
 access_key = "${var.access_key}"
 secret_key = "${var.secret_key}"
 region = "${var.region}"
}

/* Kubernetes cluster */

resource "aws_instance" "kube-master" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1e" 
 instance_type = "c3.large"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 private_ip = "192.168.1.100"
 user_data = "${file("../cloud-config/kube-master.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "20"
 } 
}

resource "aws_instance" "kube-e" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1e" 
 instance_type = "c3.large"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 count = "1"
 user_data = "${file("../cloud-config/kube-node.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "100"
 } 
}

resource "aws_instance" "kube-b" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1b" 
 instance_type = "c3.large"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 count = "1"
 user_data = "${file("../cloud-config/kube-node.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "100"
 } 
}

resource "aws_instance" "kube-c" {
 ami = "${lookup(var.amis, var.region)}"
 availability_zone = "us-east-1c" 
 instance_type = "c3.large"
 subnet_id = "XXX"
 security_groups = ["XXX"]
 key_name = "XXX"
 count = "1"
 user_data = "${file("../cloud-config/kube-node.yml")}"

 root_block_device = {
 device_name = "/dev/xvda"
 volume_type = "gp2"
 volume_size = "100"
 } 
}

And our DNS configuration.

resource "aws_route53_record" "kube-master" {
 zone_id = "XXX"
 name = "kube-master.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.kube-master.private_ip}"]
}

resource "aws_route53_record" "kube-e" {
 zone_id = "XXX"
 name = "kube-e-test.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.kube-e.0.private_ip}"]
}

resource "aws_route53_record" "kube-b" {
 zone_id = "XXX"
 name = "kube-b.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.kube-b.0.private_ip}"]
}

resource "aws_route53_record" "kube-c" {
 zone_id = "XXX"
 name = "kube-c.example.domain"
 type = "A"
 ttl = "300"
 records = ["${aws_instance.kube-c.0.private_ip}"]
}

The variables file for Kubernetes should be identical to the etcd configuration so I have chosen not to place it here.  Just refer to the previous etcd/variables.tf file.

Resources

Since each cloud-config is slightly different (and would take up a lot more space) I have included those files in the below gist.  You will need to populate the “ssh_authorized_keys:” section with your own SSH public key and update any of the IP addresses to reflect your environment.  I apologize if there are any typo’s, there was a lot of cut and paste.

Cloud configs – https://gist.github.com/jmreicha/7923c295ab6110151127

Much of the configurations that I am using are based on the Kubernetes docs, as well as some of the specific cloud configs that I have adapted, which can be found here.

Another great place to get help with Kubernetes is the IRC channel which can be found on irc.freenode.net in the #google-containers channel.  The folks that hang out there are super friendly and can almost always answer any questions you have.

As I said, development is still pretty crazy.  You can check the releases page to check out all the latest stuff.

Conclusion

Yes this can seem very convoluted at first but if everything works how it should, you now have a quick and easy way to spin up identical etcd and/or a Kubernetes environments up or down at will, which is pretty powerful.  Also, this method is dramatically easier than most of the methods I have come across so far in my own adventures and testing.

Once you get through the initial confusion and learning curve this workflow becomes a huge timesaver for testing out different versions of Kubernetes and also for experimenting with etcd.  I haven’t quite automated the entire process but I imagine that it would be easy to spin entire environments up and down by gluing all of these pieces together with some simple shell scripts.

If you need to make any configuration updates, for example to put a new version of Kubernetes in place, you will need first update your Kubernetes master/node cloud configs and then rerun terraform apply to have it recreate your environment.

The cloud config changes will destroy any nodes that rely on the old configuration.  Therefore, you will need to make sure that if you make any changes to your cloud config files you are prepared to deal with the consequences!  Ideally you should get your etcd cluster to a good spot and then leave it alone and just play around with the Kubernetes components since both of those components have been separated in order to change the components out independently.

With this workflow you can already start to see the power of terraform even with this one example.  Terraform is quickly becoming one of my favorite automation and cloud tools and is providing a very easy way to define and build infrastructure though code and configurations.

CLI hotkey and navigation guide

I have been meaning to write this post for quite a while now but have always managed to forget.  I have been piecing together useful terminal shortcuts, commands and productivity tools since I started using Linux back in the day.  If you spend any amount of time in the terminal you should hopefully know about some of these tricks already but more importantly, if you’re like me, are always looking for ways to improve the efficiency of your bash workflow and making your life easier.

There are a few things that I would quickly like to note.  If you use tmux as your CLI session manager you may not be able to use some of the mentioned hotkeys to get around by default if you don’t have some settings turned on in your configuration file.

You can take a look at my custom .tmux.conf file if you’re interested in screen style bindings plus configuration for hotkeys.  If you simply want to add the option that turns on the correct hotkey bindings for your terminal, add this line to your ~/.tmux.conf file

set-window-option -g xterm-keys on

Also, if you are a Mac user, and don’t already know about it, I highly recommend checking out iTerm2.  Coming primarily from a Linux background the hotkey bindings in Mac OS X are a little bit different than what I am used to and were initially a challenge for me to get accustomed to.  The transition for me took a little bit but iTerm has definitely helped me out immensely, as well as a few other ticks learned along the way.  I really haven’t dug through all the options in iTerm but there are a huge number of options and customizations that can made.

The only thing I have been interested in so far is the navigation which I will highlight below.

Adjust iTerm keybindings – As I mentioned, I am used to using Linux keybinding so a natural fit for my purposes is the option key.  The first step is to disable the custom binding in the iTerm preferences.  To do this, click iTerm -> Preferences -> Profiles -> Keys and find the binding for option left arrow and option right arrow and remove them from the default profile.

Next, add the following to your global key bindings, iTerm -> Preferences -> Keys.

iterm2

 

 

 

Move left one word

  • Keyboard shortcut: ??
  • Action: Send Escape Sequence
  • Escape: b

Move right one word

  • Keyboard shortcut: ??
  • Action: Send Escape Sequence
  • Escape: f

Finally, it is also worth pointing out that I use zsh for my default shell.  There are some really nice additions that zsh offers over vanilla bash.  I recently ran across this blog post which has some awesome tips.  I have also written about switching to zsh here.  Anyway, here is the lis.  It will grow as I find more tips.

Basic navigation:

  • Ctrl-left/right arrow – Jump between words quickly.
  • Opt-left/right arrow – Custom iTerm binding for jumping between words quickly.
  • Alt-left/right arrow – Linux only.  Jump between words quickly.
  • Esc-b/f – Mac OS.  Similar to arrow keys, move between words quickly.
  • Alt-b – Linux only.  Jump back one word.  Handy with other hotkeys overridden.
  • Ctrl-a – Jump to the beginning of a line (doesn’t work with tmux mappings).
  • Ctrl-e – Jump to the end of a line.
  • End – SImilar to ctrl-e this will send your cursor to the end of the line.
  • Home – Similar to End, except jumps to the beginning of the line.

Intermediate navigation:

  • Ctrl-u – Copy entire command to clipboard.
  • Ctrl-y – Paste previously copied ctrl-u command in to the terminal.
  • Ctrl-w – Cut a word to the left of the cursor.
  • Alt-d – Cut after word after the cursor position

Advanced use:

  • Ctrl-x Ctrl-e – Zsh command.  Edit the current command in your $EDITOR, which for me is vim
  • Ctrl-r – Everybody hopefully should know this one.  It is basically recursive search history
  • Ctrl-k – Erase everything after the current cursor position.  Handy for long commands
  • !<command>:p – Print the last command
  • cd … – Zsh command.  This can be easily aliased but will jump up two directories
  • !$ – Quickly access the last argument of the last command.

Zsh tab completion

Tab completion with Zsh is awesome, it’s like bash completion on steroids.  I will attempt to highlight some of my favorite tab completion tricks with Zsh.

Directory shorthand – Say you need to get to a directory that is nested deeply.  You can use the first few characters that will uniquely match to that directory to navigate instead of typing out the whole command.  So for example, cd /u/lo/b will expand out to /usr/local/bin.

command specific history – This one comes in handy all the time.  If you need to grab a command that you don’t use very often you can user Ctrl-r to match the first part of the command and from there you can up arrow to locate the command you typed.

Spelling and case correction – Bash by default can get annoying if you have a long command typed out but somehow managed to typo part of the command.  In zsh this is (sometimes) corrected for you automatically when you <tab> to complete the command.  For example if you are changing dirs in to the ‘Documents’ directory you can type ‘cd ~/doc/’ and the correct location will be expanded for you.

This list will continue to grow as I find more handy shortcuts, hotkeys or generally other useful tips and tricks that I find in my day to day command line work.  I really want to build a similar list for things in Vim but my Vim skills are unfortunately lacking plus there is already some really nice documentation and guidance out there already.  If you are interested in writing up a Vim productivity post I would love to post it.  Likewise, if you have any other nice shortcuts or tips you think are worth mentioning, post them in the comments and I will try to get them added to the list.

Add environment variable file to fig

If you haven’t heard of fig and are using Docker, you need to check it out.  Basically Fig is a tool that allows users to quickly create development environments using Docker.  Fig alleviates the complexity and tediousness of having to manually bring containers up and down, stitch them together and basically orchestrate a Docker environment.  On top of this, Fig offers some other cool functionality, for example, the ability to scale up applications.  I am excited to see what happens with the project because it was recently merged in to the Docker project.  My guess is that there will be many new features and additions to either Docker if Fig gets rolled in to the Docker core project.  For now, you can check out Fig here if you haven’t heard of it and are interested in learning more.

One issue that I have run in to is that there is currently not a great way to handle a large number of environment variables in fig.  In Docker there is an option that allows a user to pass in an environment variable file with the –env file <filename> flag.  To do the same with Fig in its current form, you are forced to list out each individual environment variable in your configuration which can quickly become tedious and confusing.

There is a PR out for adding in the ability to pass an environment variable file in to fig via the env-file option in a fig.yml file.  This approach to me is much easier than adding each environmental variable separately to the configuration with the environment option as well as having to update the fig.yml configuration if any of the values ever change.  I know that functionality like this will get merged eventually but until then I have been using the PR as a workaround to solve this issue, I think that this is also a good opportunity to show people how to get a project working manually with custom changes.  Luckily the fix isn’t difficult.

This post will assume that you have git, python and pip installed.  If you don’t have these tools installed go ahead and get that done.  The first step is to clone the fig project on github on to your local machine, see above for the link to the PR.

git clone git@github.com:docker/fig.git

Jump in to the fig project you just cloned and edit the service.py file.  This is the file that handles the processing of environment variables.  There are a few sections that need to be updated in the config.  Check the PR to be sure, but at the time of the writing, the following code should be added.

Line 55

- supported_options = DOCKER_CONFIG_KEYS + ['build', 'expose']
+ supported_options = DOCKER_CONFIG_KEYS + ['build', 'expose', 'env-file']

Line 318

+ def _get_environment(self):
+ env = {}
+
+ if 'env-file' in self.options:
+ env = env_vars_from_file(self.options['env-file'])
+
+ if 'environment' in self.options:
+ if isinstance(self.options['environment'], list):
+ env.update(dict(split_env(e) for e in self.options['environment']))
+ else:
+ env.update(self.options['environment'])
+
+ return dict(resolve_env(k, v) for k, v in env.iteritems())
+

LIne 352

- if 'environment' in container_options:
- if isinstance(container_options['environment'], list):
- container_options['environment'] = dict(split_env(e) for e in container_options['environment'])
- container_options['environment'] = dict(resolve_env(k, v) for k, v in container_options['environment'].iteritems())
+ container_options['environment'] = self._get_environment()

Line 518

+
+
+def env_vars_from_file(filename):
+ """
+ Read in a line delimited file of environment variables.
+ """
+ env = {}
+ for line in open(filename, 'r'):
+ line = line.strip()
+ if line and not line.startswith('#') and '=' in line:
+ k, v = line.split('=', 1)
+ env[k] = v
+ return env

That should be it.  Now you should be able to install fig with the new changes.  Make sure you are in the root fig directory that contains the setup.py file.

sudo python setup.py develop

Now you should be able to edit your fig.yml file to reflect the changes that have been added to fig via env-file.  Here is what a sample configuration might look like.

testcontainer:
 image: username/testcontainer
 ports:
 - "8080:8080"
 links:
 - "mongodb:mongodb"
 env-file: "/home/username/test_vars"

Notice that nothing else changed.  But instead of having to list out environment variables one at a time we can simply read in a file.  I have found this to be very useful for my workflow, I hope others can either adapt this or use this as well.  I have a feeling this will get merged in to fig at some point but for now this workaround works.

Introduction to boot2docker

boot2docker

If you work on a Mac (or Windows) and use Docker then you probably have heard of boot2docker.  If you haven’t heard of it before, boot2docker is basically a super lightweight Linux VM that is designed to run Docker containers.  Unfortunately there is no support (yet) in Mac OS X or Windows kernels for running Docker natively so this lightweight VM must be used as an intermediary layer that allows the host Operating Systems to communicate with the Docker daemon running inside the VM.  This solution really is not that limiting once you get introduced to and become comfortable with boot2docker and how to work around some of the current limitations.

Because Docker itself is such a new piece of software, the ecosystem and surrounding environment is still expanding and growing rapidly.  As such, the tooling has not had a great deal of time to mature.  So with pretty much anything that’s new, especially in the software and Open Source world, there are definitely some nuances and some things to be aware of when working with boot2docker.

That being said, the boot2docker project bridges a gap and does a great job of bringing Docker to an otherwise incompatible platform as well as making it easy to use across platforms, which especially useful for furthering the adoption of Docker among Mac and Windows users.

When getting started with boot2docker, it is important to note that there are a few different things going on under the hood.

Components

The first component is VirtualBox.  If you are familiar with virtual machines, there’s pretty much nothing new here.  It is the underpinning of running the VM and is a common tool for creating and managing VM’s.  One important note here about VBox.  This is currently the key to make volume sharing work with boot2docker to allow a user to pass local directories and files in to containers using its shared folder implementation.  Unfortunately it has been pretty well documented that vboxsf (shared folders) have not great performance when compared to other solutions.  This is something that the boot2docker team is aware of and working on for future reference.  I have a workaround that I will outline below if anyone happens to hit some of these performance issues.

The next component is the VM.  This is a super light weight image based on Tiny Core Linux and the 3.16.4 Linux kernel with AUFS to leverage Docker.  Other than that there is pretty much nothing else to it.  The TCL image is about 27MB and boots up in about 5 seconds, making it very fast, which is nice to get going with quickly.  There are instructions on the boot2docker site for creating custom .iso’s if you are interested as well if you are’t interested in building your own customized TCL.

The final component is called boot2docker-cli, which is normally referred to as the management tool.  This tools does a lot of the magic to get your host talking to the VM with minimal interaction.  It is basically the glue or the duct tape that allows users to pass commands from a local shell in to the container and get Docker to do stuff.

Installation

It is pretty dead simple to get boot2docker set up and configured.  You can download everything in one shot from the links on their site http://boot2docker.io or you can install manually on OSX with brew and a few other tools.  I highly recommend the packaged installer, it is very straight forward and easy to follow and there is a good video depiction of the process on the boot2docker site.

If you choose to install everything with brew you can use the following commands as a reference.  Obviously it will be assumed that brew is already installed and set up on your OSX system.  The first step is to install boot2docker.

brew install boot2docker

You might need to install Virtualbox separately using this method, depending on whether or not you already have a good version of Virtualbox to use with boot2docker.

The following commands will assume you are starting from scratch and do not have VBox installed.

brew update
brew tap phinze/homebrew-cask
brew install brew-cask
brew cask install virtualbox

That is pretty much it for installation.

Usage

The boot2docker CLI is pretty straight forward to use.  There are a bunch of commands to help users interface with the boot2docker VM from the command line.  The most basic and simple usage to initialize and create a vanilla boot2docker VM can be done with the following command.

boot2docker init

This will pull down the correct image and get the environment set up.  Once the VM has been created (see the tricks sections for a bit of customization) you are ready to bring up the VM.

boot2docker start

This command will simply start up the boot2docker VM and run some behind the scenes  tasks to help make using the VM seamless.  Sometimes you will be asked to set ENV variables here, just go ahead and follow the instructions to add them.

There are a few other nice commands that help you interact with the boot2docker VM.  For example if you are having trouble communicating with the VM you can run the ip command to gather network information.

boot2docker ip

If the VM somehow gets shut off or you cannot access it you can check its status.

boot2docker status

Finally there is a nice help command that serves as a good guide for interacting with the VM in various ways.

boot2docker help

The commands listed in this section will for the most part cover 90% of interaction and usage of the boot2docker VM.   There is a little bit of advanced usage with the cli covered below in the tricks section.

Tricks

You can actually modify some of the default the behavior of your boot2docker VM by altering some of the underlying boot2docker configurations.  For example, boot2docker will look in $HOME/.boot2docker/profile for any custom settings you may have.  If you want to change any network settings, adjust memory or cpu or a number of settings, you would simply change the profile to reflect the updated changes.

You can also override the defaults when you create your boot2docker VM by passing some arguments in.  If you want to change the memory or disk size by default, you would run something like

boot2docker init --memory=4096 --disksize=60000

Notice the –disksize=60000.  Docker likes to take up a lot of disk space for some of its operations, so if you can afford to, I would very highly recommending that you adjust the default disk size for the VM to avoid any strange running out of disk issues.  Most Macbooks or Windows machines have plenty of extra resources and big disks so usually there isn’t a good reason to not leverage the extra horsepower for your VM.

Troubleshooting

One very useful command for gathering information about your boot2docker environment is the boot2docker config command.  This command will give you all the basic information about the running config.  This can be very valuable when you are trying to troubleshoot different types of errors.

If you are familiar with boot2docker already you might have noticed that it isn’t a perfect solution and there are some weird quirks and nuances.  For example, if you put your host machine to sleep while the boot2docker VM is still running and then attempt to run things in Docker you may get some quirky results or things just won’t work.  This is due to the time skew that occurs when you put the machine to sleep and wake it up again, you can check the github issue for details.  You can quickly check if the boot2docker VM is out of sync with this command.

date -u; boot2docker ssh date -u

If you notice that the times don’t match up then you know to update your time settings.  The best fix for now that I have found for now is to basically reset the time settings by wrapping the following commands in to a script.

#!/bin/sh
 
# Fix NTP/Time
boot2docker ssh -- sudo killall -9 ntpd
boot2docker ssh -- sudo ntpclient -s -h pool.ntp.org
boot2docker ssh -- sudo ntpd -p pool.ntp.org

For about 95% of the time skew issues you can simply run sudo ntpclient -s -h pool.ntp.org to take care of the issue.

Another interesting boot2docker oddity is that sometimes you will not be able to connect to the Docker daemon or will sometimes receive other strange errors.  Usually this indicates that the environment variables that get set by boot2docker have disappeared,  if you close your terminal window or something similar for example.  Both of the following errors indicate the issue.

dial unix /var/run/docker.sock: no such file or directory

or

Cannot connect to the Docker daemon. Is 'docker -d' running on this host?

The solution is to either add the ENV variables back in to the terminal session by hand or just as easily modify your bashrc config file to read the values in when the terminal loads up.  Here are the variables that need to be reset, or appended to your bashrc.

export DOCKER_CERT_PATH=/Users/jmreicha/.boot2docker/certs/boot2docker-vm
export DOCKER_TLS_VERIFY=1
export DOCKER_HOST=tcp://192.168.59.103:2376

Assuming your boot2docker VM has an address of 192.168.59.103 and a port of 2376 for communication.

Shared folders

This has been my biggest gripe so far with boot2docker as I’m sure it has been for others as well.  Mostly I am upset that vboxsf are horrible and in all fairness the boot2docker people have been awesome getting this far to get things working with vboxsf as of release 1.3.  Another caveat to note if you aren’t aware is that currently, if you pass volumes to docker with “-v”, the directory you share must be located within the “/Users” directory on OSX.  Obviously not a huge issue but something to be aware if you happen to have problems with volume sharing.

The main issue with vboxsf is that it does not do any sort of caching sort of caching so when you are attempting to share a large amount of small files (big git repo’s) or anything that is filesystem read heavy (grunt) performance becomes a factor.  I have been exploring different workarounds because of this limitation but have not found very many that I could convince our developers to use.  Others have had luck by creating a container that runs SMB or have been able to share a host directory in to the boot2docker vm with sshfs but I have not had great success with these options.  If anybody has these working please let me know I’d love to see how to get them working.

The best solution I have come up with so far is using vagrant with a customized version of boot2docker with NFS support enabled, which has very little “hacking” to get working which is nice.  And a good enough selling point for me is the speed increase by using NFS instead of vboxsf, it’s pretty staggering actually.

This is the project that I have been using https://vagrantcloud.com/yungsang/boxes/boot2docker.  Thanks to @yungsang for putting this project together.  Basically it uses a custom vagrant-box based off of the boot2docker iso to accomplish its folder sharing with the awesome customization that Vagrant provides.

To get this workaround to work, grab the vagrantfile from the link provided above and put that in to the location you would like to run Vagrant from.  The magic sauce in the volume sharing is in this line.

onfig.vm.synced_folder ".", "/vagrant", type: "nfs"

Which tells Vagrant to share your current directory in to the boot2docker VM in the /vagrant directory, using NFS.  I would suggest modifying the default CPU and memory as well if your machine is beefy enough.

v.cpus = 4
v.memory = 4096

After you make your adjustments, you just need to spin up the yungsang version of boot2docker and jump in to the VM.

vagrant up
vagrant ssh

From within the VM you can run your docker commands just like you normally would.  Ports get forwarded through to your local machine like magic and everybody is happy.