Category Archives: Linux

Test Kitchen style testing for Salt

If you are already familiar with Test Kitchen, a lot of this guide should be straightforward.  ChefDK bundles most of the needed tools, so I recommend installing ChefDK and then extending it to work with Salt.

In addition to the Test Kitchen install dependencies, you will need to install the following gems to get Test Kitchen working with Salt:

  • kitchen-vagrant
  • kitchen-salt

Then create a “.kitchen.yml” file in your /srv/salt directory.  This file tells Test Kitchen how to load in its configuration so it can test out your Salt configurations.

Here is a sample of what your .kitchen.yml file might look like.

---
driver:
  name: vagrant

provisioner:
  name: salt_solo
  is_file_root: True
  pillars-from-files:
    base.sls: /srv/pillar/base.sls
  pillars:
    top.sls:
      base:
        '*':
          - base

platforms:
  - name: ubuntu-14.04

suites:
  - name: default

There is a good reference that describes the various options in the kitchen-salt docs.

I had to play around with this config to get things working correctly, so you may need to make your own adjustments.  The key components are in the “provisioner” section.  “is_file_root” is important because it tells the minion where to look for its configuration; it essentially says to use the top.sls file on the machine that runs Test Kitchen.

Use “pillars-from-files” to manually add in any custom pillar data you have.  I had issues getting the default configuration to automatically add in pillar data, so I used this approach as a workaround.

Another caveat to mention here is that in order to get this method working I had to break the best practice of storing external Salt formulas in /srv/formulas and instead copy them directly into the “root” directory of /srv/salt.  So basically all of the logic and formulas will live in this base location.  If this point isn’t clear let me know and I can post more details.
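The copy step described above can be scripted.  What follows is only a minimal sketch: it assumes the common formula layout of <name>-formula/<name>/ under the formulas directory, and the function name vendor_formulas is made up for illustration.  It needs Python 3.8 or newer for dirs_exist_ok.

```python
import shutil
import tempfile
from pathlib import Path


def vendor_formulas(formulas_dir, file_root):
    """Copy each formula's state directory into the Salt file root.

    Assumes the layout <formulas_dir>/<name>-formula/<name>/ (an assumption,
    not something kitchen-salt enforces). Returns the names that were copied.
    """
    copied = []
    for repo in sorted(Path(formulas_dir).glob("*-formula")):
        name = repo.name[: -len("-formula")]
        states = repo / name
        if not states.is_dir():
            continue  # repo does not follow the expected layout; skip it
        # dirs_exist_ok lets the copy be re-run after a formula is updated
        shutil.copytree(states, Path(file_root) / name, dirs_exist_ok=True)
        copied.append(name)
    return copied


# Demo against throwaway directories; on a real host the arguments would be
# /srv/formulas and /srv/salt.
with tempfile.TemporaryDirectory() as tmp:
    formulas = Path(tmp, "formulas")
    file_root = Path(tmp, "salt")
    (formulas / "nginx-formula" / "nginx").mkdir(parents=True)
    (formulas / "nginx-formula" / "nginx" / "init.sls").write_text("# states\n")
    file_root.mkdir()
    print(vendor_formulas(formulas, file_root))
```

Re-running the function after pulling a newer formula overwrites the copied files in place, which keeps /srv/salt in sync without hand copying.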

Vagrant style testing

The next best alternative I have found to the Salt provisioner for Test Kitchen is manually spinning up a customized Vagrant box to test communication with the Salt master, or alternatively connecting to it via salt-ssh.

This method is a great complement if you aren’t interested in running Salt in local mode and would rather learn about and test the salt-master and/or salt-ssh.  This method is also straightforward.

Here is what the custom config looks like for Vagrant.

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure(2) do |config|

 # OS config
 config.vm.box = "ubuntu/trusty64"
 config.vm.hostname = "salt-minion"
 config.vm.network :private_network, ip: "192.168.33.10"

 # Copy Salt master files for masterless provisioning
 config.vm.synced_folder "/srv/salt/", "/srv/salt/"

 # Install/config Salt
 config.vm.provision :salt do |salt|
   salt.minion_config = "/etc/salt/minion"
   salt.run_highstate = false
   salt.install_type = "daily"
   salt.colorize = true

   # For remote master preseeding
   salt.minion_key = "salt-minion.pem"
   salt.minion_pub = "salt-minion.pub"

   # Debugging
   #salt.bootstrap_options = "-D"
   #salt.verbose = true
 end

 # Additional configuration
 config.vm.provision "shell", inline: "echo '192.168.1.170 salt' >> /etc/hosts"
 config.vm.provision "shell", inline: "apt-get install -y salt-ssh"

end

This config will do a few different things:

  • Configure a static address to make some testing easier
  • Dummy a host entry for your salt master
  • Bootstrap the salt installation
  • Copy over a centrally managed minion file (if you want to customize how the minion behaves)
  • Install salt-ssh if you want to play around with ssh functionality

Note:  To use salt-ssh you will need to create an entry in /etc/salt/roster for the Vagrant machine and set up credentials to connect.  All of the Vagrant configuration options can be found in the Vagrant docs.  Obviously much more can be done in Vagrant but you will have to test the various options yourself to see what suits your needs.
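For the roster entry mentioned in the note, here is a rough sketch that renders one.  The host, user, sudo and priv keys are standard salt-ssh roster fields; the helper name roster_entry is hypothetical, and the default user of vagrant is an assumption based on how stock Vagrant boxes are set up.

```python
def roster_entry(minion_id, host, user="vagrant", sudo=True, priv=None):
    """Render a single salt-ssh roster entry as YAML text."""
    lines = [
        f"{minion_id}:",
        f"  host: {host}",
        f"  user: {user}",
        f"  sudo: {sudo}",
    ]
    if priv is not None:
        # Path to the SSH private key used to log in (optional)
        lines.append(f"  priv: {priv}")
    return "\n".join(lines) + "\n"


# The ID and address match the Vagrantfile above
print(roster_entry("salt-minion", "192.168.33.10"), end="")
```

The printed text can be appended to /etc/salt/roster; which credentials to use (password or key) is left to you.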

To check current Salt keys, run the following command on the master.  This should not list any keys yet since we haven’t created them.

sudo salt-key -L

With this configuration we generate a key pair once and reuse it, so we only need to accept the key once on the Salt master.  To generate the keys, run the following command from the root Vagrant directory.

sudo salt-key --gen-keys=salt-minion

Then to add the new entry on the Master (after bringing up the Vagrant box!):

sudo salt-key -a 'salt-minion'

Once this set of keys has been accepted, you can bring the minion VM up and down without having to worry about adding and deleting keys every time you need to test something.  Obviously this approach should stay in testing environments and not be carried into production.

Lastly, use this command to delete an old minion:

sudo salt-key -d salt-minion
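To check which state the minion key is in at each step, the output of salt-key -L can be grouped by section.  This is a sketch only: the sample text is an assumption about what the uncolored output looks like, and parse_salt_key_list is a made-up helper name.

```python
def parse_salt_key_list(output):
    """Group minion IDs from `salt-key -L` text by their section heading."""
    keys = {}
    section = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("Keys:"):
            # "Unaccepted Keys:" becomes the dictionary key "unaccepted"
            section = line[: -len(" Keys:")].lower()
            keys[section] = []
        elif section is not None:
            keys[section].append(line)
    return keys


# Assumed sample: the Vagrant box is up but its key is not accepted yet
SAMPLE = """\
Accepted Keys:
Denied Keys:
Unaccepted Keys:
salt-minion
Rejected Keys:
"""

state = parse_salt_key_list(SAMPLE)
print(state["unaccepted"])
```

After running salt-key -a the same ID should move from the unaccepted list to the accepted one.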

Conclusion

Being new to Salt, I found the combination of the custom Vagrant box and the Test Kitchen provisioner a great way to learn Salt and to test Salt configurations.  The best part about using this method is that there is no additional work to get the two methods working together.  For example, once you have your directory structure set up correctly on the host system (master config), you will already have everything ready to go for Test Kitchen as well as the Vagrant box method of testing.

I have found the combination to be very useful in my own learning of Salt so far.  Obviously this won’t address all of the complexity of a deployment, but it is a great and easy way to get introduced to many of the concepts and ideas of Salt.

I am really enjoying Salt so far and I hope that readers can put some of my findings to use in their own learning as well.

Quicktip: Manage Memory Usage with Supervisord

I have been using Supervisord for process management for quite a while now but had no idea it could manage memory usage (among other things) until just recently.

There is a Python project called Superlance which essentially adds some extra functionality to supervisord for managing processes and memory.  The docs are a little thin so I thought it would be a good idea to highlight some of the functionality for folks that just want a few examples of how it works or can be used in a useful way.

Obviously you will want to have supervisor installed and configured already.  That can be done with pip or via apt-get.  You will also need to make sure you have a proper [unix_http_server] section in your /etc/supervisor/supervisord.conf file.
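A quick way to check that prerequisite is to parse the config and look for the section.  The sketch below also looks for [rpcinterface:supervisor], which is an addition on my part since the RPC interface is what clients talk to over that socket; the helper name missing_sections and the sample config are made up for illustration.

```python
import configparser

# Sections this sketch looks for; the second one is an assumption added here
REQUIRED_SECTIONS = ("unix_http_server", "rpcinterface:supervisor")


def missing_sections(config_text):
    """Return the required section names that are absent from the config."""
    # RawConfigParser avoids choking on supervisor's %(...)s expansions
    parser = configparser.RawConfigParser(
        strict=False, inline_comment_prefixes=(";",)
    )
    parser.read_string(config_text)
    return [name for name in REQUIRED_SECTIONS if not parser.has_section(name)]


SAMPLE = """\
[unix_http_server]
file=/var/run/supervisor.sock   ; path to the socket file

[supervisord]
logfile=/var/log/supervisor/supervisord.log
"""

print(missing_sections(SAMPLE))
```

On a real host you would pass in the contents of /etc/supervisor/supervisord.conf instead of the sample string.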

To install Superlance (on Ubuntu 14.04):

sudo pip install superlance

This will download and install a handful of Python scripts that can then be plugged in to Supervisor.  Check the link above if you are interested in the other plugins.

Then you will need to add a section to your supervisor config for memmon to manage memory usage.

[eventlistener:memmon]
command=memmon -p <program_name>=3GB
events=TICK_60

The “-p <program_name>” corresponds to the program header in your supervisor configuration.  There are other options available to manage group processes, etc. for more advanced use cases but this should cover most basic scenarios.
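If you watch several programs, the listener section can be generated rather than typed by hand.  This is a minimal sketch: it assumes memmon accepts the -p flag more than once, and the program names web and worker are placeholders for the program headers in your own config.

```python
import configparser
import io


def memmon_section(limits, event="TICK_60"):
    """Render an [eventlistener:memmon] section from {program: size} pairs."""
    args = " ".join(f"-p {name}={size}" for name, size in limits.items())
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser["eventlistener:memmon"] = {
        "command": f"memmon {args}",
        "events": event,
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# Hypothetical program names and limits
print(memmon_section({"web": "3GB", "worker": "512MB"}))
```

The output can be dropped into a file under /etc/supervisor/conf.d/ or pasted into supervisord.conf.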

You will need to reload the supervisor configuration after your changes have been made.  Unfortunately the supervisor process needs to be fully reloaded.

sudo supervisorctl reload

If you want to check that the memmon script is available before restarting supervisor you can use reread.

sudo supervisorctl reread

I would suggest reading through the Superlance docs and checking out the other scripts.  They add a useful extra layer of functionality to supervisord that I didn’t know existed.

Change CoreOS default toolbox

This is a little trick that allows you to override the default base OS in the CoreOS “toolbox”.  The toolbox is a neat way to debug and troubleshoot issues inside containers on CoreOS without having to do any outside work of setting up a container.

The toolbox uses Fedora by default, which we’re going to change to Ubuntu.  Custom settings are read from the .toolboxrc file, located at /home/core/.toolboxrc by default.  To keep things simple we will only change the few pieces of the config needed to get the toolbox to behave how we want.  More can be changed but we don’t really need to override anything else.

TOOLBOX_DOCKER_IMAGE=ubuntu
TOOLBOX_DOCKER_TAG=14.04

That’s pretty cool, but what if we want to have this config file be in place for all servers?  We don’t want to have to manually write this config file for every server we log in to.

To fix this issue we will add a simple configuration in to the user-data file that gets fed in to the CoreOS cloud-config when the server is created.  You can find more information about the CoreOS cloud-configs here.

The bit in the cloud config that needs to change is the following.

write_files:
  - path: /home/core/.toolboxrc
    owner: core
    content: |
      TOOLBOX_DOCKER_IMAGE=ubuntu
      TOOLBOX_DOCKER_TAG=14.04

If you are already using cloud-config then this change should be easy: just add the entry starting with “- path” to your existing write_files section.  New servers using this config will have the desired toolbox defaults.

This approach gives us an automated, reproducible way to clone our custom toolbox config to every server that uses cloud-config to bootstrap itself.  Once the config is in place simply run the “toolbox” command and it should use the custom values to pull the desired Ubuntu image.

Then you can run your Ubuntu commands and debugging tools from within the toolbox.  Everything else will be the same, we just use Ubuntu now as our default toolbox OS.  Here is the post that gave me the idea to do this originally.

DevOps Conferences


I did a post quite a while ago that highlighted some of the cooler system admin and operations oriented conferences that I had on my radar at that time.  Since then I have changed jobs and am now in a DevOps oriented position, so I’d like to revisit the subject and update that list to reflect some of the cool conferences that are in the DevOps space.

I’d like to start off by saying first that even if you can’t make it to the bigger conferences, local groups and meet ups are also an excellent way to get out and meet other professionals that do what you do. Local groups are also an excellent way to stay in the loop on what’s current and also learn about what others are doing.  If you are interested in eventually becoming a presenter or speaker, local meet ups and groups can be a great way to get started.  There are numerous opportunities and communities (especially in bigger cities), check here for information or to see if there is a DevOps meet up near you.  If there is nothing near by, start one!  If you can’t find any DevOps groups look for Linux groups or developer groups and network from there, DevOps is beginning to become popular in broader circles.

After you get your feet wet with meet ups, the next place to start looking is conferences that sound like they might be interesting to you.  There are about a million different opportunities to choose from: security conferences, developer conferences, server and network conferences, all the way down the line.  I am sticking strictly with DevOps related conferences because that is currently what I am interested in and know best.

Feel free to comment if I missed any conferences that you think should be on this list.

DevOps Days (Multiple dates)

Perhaps the most DevOps centric conference on this list.  These conferences are a great way to meet with fellow DevOps professionals and network with them.  The space and industry are changing constantly and being on top of all of the changes is crucial to being successful.  Another nice thing about DevOps Days is that they are spread out around the country (and world) and throughout the year, so they are very accessible.  WARNING:  DevOps Days are not tied to any one set of DevOps tools but rather to the principles and techniques and how to apply them to different environments.  If you are looking for super in depth technical talks, this one may not be for you.

ChefConf (March)

The main Chef conference.  There are large conferences for the main configuration management tools but I chose to highlight Chef because that’s what we use at my job.  There are lots of good talks that have a Chef centered theme but also are great because the practices can be applied with other tools.  For example, there are many DevOps themes at ChefConf including continuous integration and deployment topics, how to scale environments, tying different tools together and just general configuration management techniques.  Highly recommend for Chef users, feel free to substitute the other big configuration management tool conferences here if Chef isn’t your cup of tea (Salt, Puppet, Ansible).

CoreOS Fest (May)

  • 2015 videos haven’t been posted yet

Admittedly, this is a much smaller and more niche conference but it is still awesome.  The conference is the first one put on by the folks at CoreOS and was designed to help the community keep up with what is going on in the CoreOS and container world.  The venue is pretty small but the content at this year’s conference was very good.  There were some epic announcements and talks, including Tectonic announcements and Kubernetes deep dives, so if container technology is something you’re interested in then this conference would definitely be worth checking out.

Velocity (May)

This one just popped up on my DevOps conference radar.  I have been hearing good things about this conference for a while now but have not had the opportunity to go to it.  It always has interesting speakers and topics, and a number of the DevOps thought leaders show up for this event.  One cool thing about this conference is that there are a variety of different topics at any one time so it offers a nice, wide spectrum of information.  For example, there are technical tracks covering different areas of DevOps.

DockerCon (June)

Docker has been growing at a crazy pace so this seems like the big conference to go check out if you are in the container space.  This conference is similar to CoreOS Fest but focuses more heavily on Docker topics (obviously).  I haven’t had a chance to go to one of these yet, but containers and Docker have so much momentum that they are very difficult to avoid.  Also, many people believe that container technologies are going to be the path to the future, so it is a good idea to be as close to the action as you can.

Monitorama (June)

This is one of the coolest conferences I think, but that is probably just because I am so obsessed with monitoring and metrics collection.  Monitoring seems to be one of those topics that isn’t always fun to deal with or work around but talks and technologies at this conference actually make me excited about monitoring.  To most, monitoring is a necessary evil and a lot of the content from this conference can help make your life easier and better in all aspects of monitoring, from new trends and tools to topics on how to correctly monitor and scale infrastructures.  Talks can be technical but well worth it, if monitoring is something that interests you.

AWS Re:Invent (November)

This one is a monster.  This is the big conference that AWS puts on every year to announce new products and technologies that they have been working on as well as provide some incredibly helpful technical talks.  I believe this conference is one of the pricier and more exclusive conferences but offers a lot in the way of content and details.  This conference offers some of the best, most technical topics of discussion that I have seen and has been invaluable as a learning resource.  All of the videos from the conference are posted on YouTube so you can get access to this information for free.  Obviously the content is related to AWS but I have found this to be a great way to learn.

Conclusion

Even if you don’t have a lot of time to travel or get out to these conferences, nearly all of them post video from the event so you can watch it whenever you want to.  This is an INCREDIBLE learning tool and resource that is FREE.  The only downside to the videos is that you can’t ask any questions, but it is easy to find the presenters’ contact info if you are interested and feel like reaching out.

That being said, you tend to get a lot more out of attending the conference.  The main benefit of going to conferences over watching the videos alone is that you get to meet and talk to others in the space and get a feel for what everybody else is doing as well as check out many cool tools that you might otherwise never hear about.  At every conference I attend, I always learn about some new tech that others are using that I have never heard of that is incredibly useful and I always run in to interesting people that I would otherwise not have the opportunity to meet.

So definitely if you can, get out to these conferences, meet and talk to people, and get as much out of them as you can.  If you can’t make it, check out the videos afterwards for some really great nuggets of information, they are a great way to keep your skills sharp and current.

If you have any more conferences to add to this list I would be happy to update it!  I am always looking for new conferences and DevOps related events.

Composing a Graphite server with Docker

Grafana dashboard

Recently our Graphite server needed to be overhauled, which I was not looking forward to.  Luckily Docker makes the process of building identical and reproducible images for configuring a new server much easier and less painful than other methods.

Introduction

If you don’t know what Graphite is you can check out the documentation for more info.  Basically it is a tool to collect and aggregate metrics of pretty much any kind into a central location.  It is a great complement to something like statsd for metric collection and aggregation, which I will go over later.

The setup I will be describing today leverages a handful of components to work.  The first and most important part is Graphite.  This includes all of the parts that make up Graphite, including the carbon aggregator and carbon cache for the collection and processing of metrics as well as the whisper db for storing metrics.
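To make the collection side concrete, here is a minimal sketch of Carbon's plaintext protocol, where each metric is one line of the form "path value timestamp".  Port 2003 is Carbon's default line receiver port; the function names and the metric path are made up for illustration.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric for Carbon's plaintext (line) receiver."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(line, host="localhost", port=2003, timeout=5.0):
    """Send a formatted line to Carbon over TCP (needs a running server)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting only; call send_to_carbon(...) once a server is listening
print(carbon_line("test.deploys.count", 1, 1435000000), end="")
```

Carbon writes each line it receives into a whisper file named after the metric path.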

There are several other alternative backends but I don’t have any experience with them so won’t be posting any details.  If you are interested, InfluxDB and OpenTSDB both look like interesting alternative backends to whisper for storing metrics.

The Problem

Graphite is notoriously difficult to install and configure properly.  If you haven’t tried to set up Graphite before, give it a try.

Another argument that I hear quite a bit is that the Graphite workload doesn’t really fit in with the Docker model.  In a distributed or highly available architecture that might be the case, but in the example I cover here we are taking a different approach.

The design and implementation separates data on to an EBS volume which is a durable storage resource, so it doesn’t matter if the server were to have problems.  With our approach and process we can reprovision the server and have everything up and running in less than 5 minutes.

The benefit of doing it this way is fast, repeatable recovery.  Another benefit of our approach is that we are leveraging the graphite-api package, so we get all of the Graphite goodness without having to run the rest of the bloat, and then proxying it through nginx/uWSGI, which helps with performance.  I will go over this setup in a little bit.  No Graphite server would be complete without Grafana, which turns out to be stupidly easy to add using the Docker approach.

If we were ever to try to expand this architecture I think a distributed model using EFS (currently in preview) along with some type of load balancer in front to distribute requests evenly may be a possibility.  If you have experience running Graphite across many nodes I would love to hear what you are doing.

The Solution

There are a few components to our architecture.  The first is a tool I have been writing about recently called Terraform.  We use this with some custom scripting to build the server, configure it and attach our Graphite data volume to the server.

Here is what a sample terraform config might look like to provision the server with the tools we want.  This server is provisioned to an AWS environment and leverages a number of variables.  You can check the docs on how variables work or if there is too much confusion I can post an example.

provider "aws" {
  access_key = "${var.access_key}"
  secret_key = "${var.secret_key}"
  region = "${var.region}"
}

resource "aws_instance" "graphite" {
  ami = "${lookup(var.amis, var.region)}"
  availability_zone = "us-east-1e"
  instance_type = "c3.xlarge"
  subnet_id = "${var.public-1e}"
  security_groups = ["${var.graphite}"]
  key_name = "XXX"
  user_data = "${file("../cloud-config/graphite.yml")}"

  root_block_device = {
    volume_type = "gp2"
    volume_size = "20"
  }

  connection {
    user = "username"
    key_file = "${var.key_path}"
  }

  # mount EBS
  provisioner "local-exec" {
    command = "aws ec2 attach-volume --region=us-east-1 --volume-id=${var.graphite_data_vol} --instance-id=${aws_instance.graphite.id} --device=/dev/xvdf"
  }

  provisioner "remote-exec" {
    inline = [
      "while [ ! -e /dev/xvdf ]; do sleep 1; done",
      "echo '/dev/xvdf /data ext4 defaults 0 0' | sudo tee -a /etc/fstab",
      "sudo mkdir /data && sudo mount -t ext4 /dev/xvdf /data"
    ]
  }

}
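The remote-exec step above polls until the attached volume shows up as /dev/xvdf, and it will wait forever if the attach fails.  As a rough sketch of the same idea with a timeout (the helper name wait_for_path and the 120 second default are assumptions, not part of our actual tooling):

```python
import os
import time


def wait_for_path(path, timeout=120.0, interval=1.0):
    """Poll until `path` exists; return False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


# Example: only continue with the mount if the device appeared in time
if not wait_for_path("/dev/xvdf", timeout=0.2, interval=0.05):
    print("volume did not attach in time")
```

Giving up after a deadline lets the provisioning run fail loudly instead of hanging on a volume that never attaches.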

Optionally, if you have an Elastic IP to use, you can tack that on to your config.

resource "aws_eip" "graphite" {
  instance = "${aws_instance.graphite.id}"
  vpc = true
}

The Graphite server uses a mostly standard config and installs the components that we need to run it: docker, python, pip, docker-compose and so on.  Here is what a sample cloud config for the Graphite server might look like.

#cloud-config

# Make sure OS is up to date
apt_update: true
apt_upgrade: true
disable_root: true

# Connect to private repo
write_files:
 - path: /home/<user>/.dockercfg
   owner: user:group
   permissions: '0755'
   content: |
     {
       "https://index.docker.io/v1/": {
         "auth": "XXX",
         "email": "email"
       }
     }

# Capture all subprocess output for troubleshooting cloud-init issues
output: {all: '| tee -a /var/log/cloud-init-output.log'}

packages:
 - python-dev
 - python-pip

# Install latest Docker version
runcmd:
 - apt-get -y install linux-image-extra-$(uname -r)
 - curl -sSL https://get.docker.com/ubuntu/ | sudo sh
 - usermod -a -G docker <user>
 - sg docker
 - sudo pip install -U docker-compose

# Reboot for changes to take effect
power_state:
 mode: reboot
 delay: "+1"

ssh_authorized_keys:
 - <put your ssh public key here>

Docker

This is where most of the magic happens.  As noted above, we are using Docker and a few of its tools to get everything working.  All the logic to get Graphite running is contained in the Dockerfile, which will require some customizing but is similar to the following.

# Building from Ubuntu base
FROM ubuntu:14.04.2

# This suppresses a bunch of annoying warnings from debconf
ENV DEBIAN_FRONTEND noninteractive

# Install all system dependencies
RUN \
 apt-get -qq install -y software-properties-common && \
 add-apt-repository -y ppa:chris-lea/node.js && \
 apt-get -qq update -y && \
 apt-get -qq install -y build-essential curl \
 # Graphite dependencies
 python-dev libcairo2-dev libffi-dev python-pip \
 # Supervisor
 supervisor \
 # nginx + uWSGI
 nginx uwsgi-plugin-python \
 # StatsD
 nodejs

# Install StatsD
RUN \
 mkdir -p /opt && \
 cd /opt && \
 curl -sLo statsd.tar.gz https://github.com/etsy/statsd/archive/v0.7.2.tar.gz && \
 tar -xzf statsd.tar.gz && \
 mv statsd-0.7.2 statsd

# Install Python packages for Graphite
RUN pip install graphite-api[sentry] whisper carbon

# Optional install graphite-api caching
# http://graphite-api.readthedocs.org/en/latest/installation.html#extra-dependencies
# RUN pip install graphite-api[cache]

# Configuration
# Graphite configs
ADD carbon.conf /opt/graphite/conf/carbon.conf
ADD storage-schemas.conf /opt/graphite/conf/storage-schemas.conf
ADD storage-aggregation.conf /opt/graphite/conf/storage-aggregation.conf
# Supervisord
ADD supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# StatsD
ADD statsd_config.js /etc/statsd/config.js
# Graphite API
ADD graphite-api.yaml /etc/graphite-api.yaml
# uwsgi
ADD uwsgi.conf /etc/uwsgi.conf
# nginx
ADD nginx.conf /etc/nginx/nginx.conf
ADD basic_auth /etc/nginx/basic_auth

# nginx
EXPOSE 80 \
# graphite-api
8080 \
# Carbon line receiver
2003 \
# Carbon pickle receiver
2004 \
# Carbon cache query
7002 \
# StatsD UDP
8125/udp \
# StatsD Admin
8126

# Launch stack
CMD ["/usr/bin/supervisord", "-c", "/etc/supervisor/supervisord.conf"]

The other component we need is Grafana, which we don’t actually build but pull from the Docker Hub registry and mount our custom volume into.  This is all captured in our docker-compose.yml file listed below.

graphite:
  build: ./docker-graphite
  restart: always
  ports:
    - "8080:80"
    - "8125:8125/udp"
    - "8126:8126"
    - "2003:2003"
    - "2004:2004"
  volumes:
    - "/data/graphite:/opt/graphite/storage/whisper"

grafana:
  image: grafana/grafana
  restart: always
  ports:
    - "80:3000"
  volumes:
    - "/data/grafana:/var/lib/grafana"
  links:
    - graphite
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=password123

We have open sourced our configuration and placed it on github so you can take a look at it to get a better idea of the configs and how everything is working with some working examples.  The github repo is a quick way to try out the stack without having to provision and build an environment to run this on.  If you are just interested in kicking the tires I suggest starting with the github repo.

The build directive above corresponds to the repo on github.
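With the StatsD port published as 8125/udp in the compose file, metrics can be sent as plain UDP datagrams.  Below is a minimal sketch of the StatsD line format for counters ("name:value|c") and timers ("name:value|ms"); the function names and metric names are placeholders.

```python
import socket


def statsd_counter(name, value=1):
    """Build a StatsD counter datagram, e.g. b'app.logins:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def statsd_timer(name, millis):
    """Build a StatsD timer datagram with a duration in milliseconds."""
    return f"{name}:{millis}|ms".encode("ascii")


def send_statsd(payload, host="localhost", port=8125):
    """Fire-and-forget UDP send; no error is raised if nothing is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Hypothetical metric names
print(statsd_counter("app.logins"))
print(statsd_timer("app.request_time", 12))
```

StatsD aggregates what it receives and flushes the results to Carbon on an interval, so new metrics take a moment to appear in Graphite.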

The last component is actually running the Docker containers.  As you can see we use docker-compose, but we also need a way to start the containers automatically after a disruption such as a reboot.  That is actually pretty easy.  On Ubuntu (or any system using upstart) you can create an upstart job to start docker-compose and restart it automatically if it has problems.  Here I have created a file called /etc/init/graphite.conf with the following configuration.

description "Graphite"
start on filesystem and started docker
stop on runlevel [!2345]
respawn
chdir /home/user
exec docker-compose up

A systemd service would achieve a similar goal but the version of Ubuntu used here doesn’t leverage systemd.

After everything has been dropped in place and configured you can check your work by testing out Grafana by hitting the public IP address of your server.  If you hit the Grafana splash page everything should be working!
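Beyond the splash page, you can confirm that metrics are arriving by querying the Graphite render endpoint directly.  This sketch only builds the URL: port 8080 comes from the compose file above, the host and target name are placeholders, and since nginx in this stack is set up with basic auth you would need to supply credentials when fetching it.

```python
from urllib.parse import urlencode


def render_url(host, target, window="-10min", port=8080):
    """Build a Graphite /render URL that returns recent datapoints as JSON."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"http://{host}:{port}/render?{query}"


# 203.0.113.10 is a documentation address; use your server's public IP
print(render_url("203.0.113.10", "test.deploys.count"))
```

An empty JSON list in the response usually means the metric name is wrong or nothing has been written for that window yet.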

Grafana dashboard

Conclusion

There are many pieces to this puzzle and honestly we don’t have the requirement of having Graphite be 100% available and redundant so we can get away with a single server for our needs.  A separate EBS volume and Terraform allow us to rebuild the server quickly and automatically if something were to happen to the server.  Also, the way we have designed Graphite to run will be able to handle a substantial workload without falling over.  But if you are doing anything cool with Graphite HA or resiliency I would like to hear how you are doing it, there is always room for improvement.

If you are just interested in trying out the Graphite stack I highly suggest going over to the github repo and running the container stack to play around with the components, especially if you are interested in learning about how statsd and Graphite collect metrics.  The Grafana interface gives you a nice way to tap into the metrics that get pumped into Graphite.