Autosnap AWS snapshot and volume management tool

This is my first serious attempt at a Python tool on github.  I figured it was about time, as I’ve been leveraging Open Source tools for a long time, I might as well try to give a little bit back.  Please check out the project and leave feedback by emailing, opening a github or issue or commenting here, I’d love to see what can be done with this tool, there are lots of bugs to shake out and things to improve.  Even better if you have some code you’d like to contribute, this is very much a work in progress!

Here is the project – https://github.com/jmreicha/autosnap.

Introduction

Essentially, this tool is designed to ease the management of the snapshot and volume lifecycle in an AWS environment.  I have discovered that snapshots and volumes can be used together to form a simple backup management system, so by simplifying the management of these resources, by utilizing the power of the AWS API, you can easily manage backups of your AWS data.

While this obviously isn’t a full blown backup tool, it can do a few handy things like leverage tags to create and destroy backups based on custom expiration dates and create snapshots based on a few other criteria, all managed with tags.  Another cool thing about handling backups this way is that you get amazing resiliency by storing snapshots to S3, as well as dirt cheap storage.  Obviously if you have a huge number of servers and volumes your mileage will vary, but this solution should scale up in to the hundreds, if not thousands pretty easily.  The last big bonus is that you can nice granularity for backups.

For example, if you wanted to keep a weeks worth of backups across all your servers in a region, you would simply use this tool to set an expiration tag of 7 days and voila.  You will have rolling backups, based on snapshots for the previous seven days.  You can get the backup schedule fairly granular, because the snapshots are tagged down to the hour. It would be easy to get them down to the second if that is something people would find useful, I could see DB snapshots being important enough but for now it is set to the hour.

The one drawback is that this needs to be run on a daily basis so you would need to add it to a cron job or some other tool that runs tasks periodically.  Not a drawback really as much of a side note to be aware of.

Configuration

There is a tiny bit of overhead to get started, so I will show you how to get going.  You will need to either set up a config file or let autosnap build you one.  By default, autosnap will help create one the first time you run it, so you can use this command to build it:

autosnap

If you would like to provide your own config, create a file called ‘.config‘ in the base directory of this project.  Check the README on the github page for the config variables and for any clarifications you may need.

Usage

Use the –help flag to get a feeling for some of the functions of this tool.

$ autosnap --help

usage: autosnap [--config] [--list-vols] [--manage-vols] [--unmanage-vols]
 [--list-snaps] [--create-snaps] [--remove-snaps] [--dry-run]
 [--verbose] [--version] [--help]

optional arguments:
 --config          create or modify configuration file
 --list-vols       list managed volumes
 --manage-vols     manage all volumes
 --unmanage-vols   unmanage all volumes
 --list-snaps      list managed snapshots
 --create-snaps    create a snapshot if it is managed
 --remove-snaps    remove a snapshot if it is managed
 --version         show program's version number and exit
 --help            display this help and exit

The first thing you will need to do is let autosnap manage the volumes in a region:

autosnap --manage-vols

This command will simply add some tags to help with the management of the volumes.  Next, you can take a look and see what volumes got  picked up and are now being managed by autosnap

autosnap --list-vols

To take a snapshot of all the volumes that are being managed:

autosnap --create-snaps

And you can take a look at your snapshots:

autosnap --list-snaps

Just as easily you can remove snapshots older than the specified expiration date:

autosnap --remove-snaps

There are some other useful features and flags but the above commands are pretty much the meat and potatoes of how to use this tool.

Conclusion

I know this is not going to be super useful for everybody but it is definitely a nice tool to have if you work with AWS volumes and snapshots on a semi regular basis.  As I said, this can easily be improved so I’d love to hear what kinds of things to add or change to make this a great tool.  I hope to start working on some more interesting projects and tools in the near future, so stay tuned.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Patching CVE-6271 (shellshock) with Chef

If you haven’t heard the news yet, a recently disclosed vulnerability has been released that exploits environmental variables in bash.  This has some far reaching implications because bash is so widespread and runs on many different types of devices, for example network gear, routers, switches, firewalls, etc.  If that doesn’t scare you then you probably don’t need to finish reading this article.  For more information you can check out this article that helped to break the story.

I have been seeing a lot “OMG the world is on fire, patch patch patch!” posts and sentiment surrounding this recently disclosed vulnerability, but basically have not seen anybody taking the time to explain how to patch and fix this issue.  It is not a difficult fix but it might not be obvious to the more casual user or those who do not have a sysadmin or security background.

Debian/Ubuntu:

Use the following commands to search through your installed packages for the correct package release.  You can check the Ubuntu USN for versions.

dpkg -l | grep '^ii' or
dpkg-query --show bash

If you are on Ubuntu 12.04 you will need update to the following version:

bash    4.2-2ubuntu2.3

If you are on Ubuntu 13.10, and have this package (or below), you are vulnerable.  Update to 14.04!

 bash 4.2-5ubuntu3

If you are on Ubuntu 14.04, be sure to update to the most recently patched patch.

bash 4.3-7ubuntu1.3

Luckily, the update process is pretty straight forward.

apt-get update
apt-get --only-upgrade install bash

If you have the luxury of managing your environment with some sort of automation or configuration management tool (get this in place if you don’t have it already!) then this process can be managed quite efficiently.

It is easy to check if a server that is being managed by Chef has the vulnerability by using knife search:

knife search node 'bash_shellshock_vulnerable:true'

From here you could createa a recipe to patch the servers or fix each one by hand.  Another cool trick is that you can blast out the update to Debian based servers with the following command:

knife ssh 'platform_family:debian' 'sudo apt-get update; sudo apt-get install -y bash'; knife ssh 'platform_family:redhat' 'sudo yum -y install bash'

This will iterate over every server in your Chef server environment that is in the Debain family (including Ubuntu) or RHEL family (including CentOS) and update the server packages so that the latest patched bash version gets pulled down and then gets updated to the latest version.

You may need to tweak the syntax a little, -x to override the ssh user and -i to feed an identity file.  This is so much faster than manually installing the update on all your servers or even fiddling around with a tool like Fabric, which is still better than nothing.

One caveat to note:  If you are not on an LTS version of Ubuntu, you will need to upgrade your server(s) first to an LTS, either 12.04 or 14.04 to qualify for this patch.  Ubuntu 13.10 went out of support in August which was about a month ago as per the time of this writing so you will want to get your OS up date.

One more thing:  The early patches to address this vulnerability did not entirely fix the issue, so make sure that you have the correct patch installed.  If you patched right away there is a good chance you may still be vulnerable, so simply rerun your knife ssh command to reapply the newest patch, now that the dust is beginning to settle.

Outside of this vulnerability, it is a good idea to get your OS on an Ubuntu LTS version anyway to continue receiving critical updates for software as well as security patches for a longer duration than the normal, 6 month release cycle of the server distribution.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Analyzing cloud costs

Knowing about and controlling the costs of a cloud environment is not only good to know how to do as an admin/engineer, it also greatly helps others inside your organization.  Knowing your environment and cost overhead also makes you (or your team) look better when you can pinpoint bottlenecks, as well as anomalies in your environment, and create solutions to mitigating costs or otherwise track cloud resource utilization.  Plus, it can even get you some extra credit.

So with this in mind, I’d like to talk about a few strategies and tools I have been experimenting with to help road map and accurately model different costs and utilization for different workloads spread out accross an AWS environment.

ICE

The first tool I’d like to mention is ICE and is probably my favorite tool. It is a tool developed by Netflix and analyzes costs across your AWS infrastructure.  It gives you nice graphs and advanced breakdowns of prices, including spot pricing vs on demand and many other permutations across your AWS infrastructure.

This is the best explanation I can find, pulled right from their github page:

The ability to trend usage patterns on a global scale, yet decompose them down to a region, availability zone, or service team provides incredible flexibility. Ice allows us to quantify our AWS footprint and to make educated decisions regarding reservation purchases and reallocation of resources.

Amazon ICE

It has a nice interface and some slick filtering, so breaking things down on a region by region level becomes easy, which is otherwise not the case for the other tools.  This tool is also great for spotting trends and anomalies in your environment which can sometimes go undetected if not viewed in the correct context.

The downside is the overhead associated with getting this up and running bu there is a Chef cookbook that will pretty much do the installation for you, if you are comfortable with Chef.  You will need to override some attributes but otherwise it is pretty straight forward.  If you need assistance let me know and I’d be glad to walk you through getting it set up.

AWS Calculator

This is a handy tool to help ballpark and model various costs for AWS services.  One disappointing discovery of this tool is that it doesn’t help model spot instance prices.

AWS calculator

This is great for mocking out what the TCO of a server or group of servers might look like.  It is also good for getting a general feel for what different server costs will be for a certain number of months and/or years.

Be sure to check this out to help stay current on the most recent news because AWS moves quickly with seemingly constant updates and have been dropping prices steadily over the past 3 years.  Especially with the increased competition from Microsoft (Azure) and Google (Google Cloud), AWS seems to be constantly slashing prices and adding new improvement and features to their product.

AWS Billing and Cost Management

This one is pretty self explanatory.  It is built right in to AWS and as such, it can be a very powerful tool that can easily be overlooked.  It offers a variety of detailed information about costs and billing.  It offers some nice graphs and charts for trend spotting and can be exported for analysis, which is also nice (even though I haven’t got that far yet).

The major downside (in my opinion) is that you can’t get the granular price breakdowns that are available with a tool like ICE.  For example, there isn’t an easy way to find a price comparison breakdown for cost per region or other more detailed information.

Trusted Advisor

This tool is great and is free for basic usage.  This offering from AWS is really nice for helping to find and optimize settings according to a number of good practice recommendations created by Amazon.  Not only does it give you some really nice price breakdowns but it also reports things like security and performance which can be equally useful.  Use this often to tighten up areas of your infrastructure and to optimize costs.

One down for this one is that to unlock all of the features and functionality you need to upgrade to the enterprise version which is obviously more expensive.

AWS ELK Billing

I just found out about this one but it looks like it might be a very nice solution, leveraging the Logstash + Kibana stack.  I have written a post about getting started with the ELK stack so it shouldn’t be difficult at all to begin playing around with this solution if you are interested.

If you get this tool up and working I would love to hear about it.

Cost saving tips

I have compiled a list of simple yet powerful tips to help control costs in AWS.  Ideally a combination of all of these tips would be used to help control costs.

  • Upgrade server and service instance generations as often as possible for automatic improved performance and reduced price.  For example gen 1 to gen 3, m1.xxx -> m3.xxx.
  • Try to size servers correctly by keeping them busy.  Servers that are running but aren’t doing anything are essentially wasting money.  Either run them according to time of day or bump up the amount of utilization per box, either by downsizing the server or upping the workload.
  • On that note, size servers correctly according to workload.  For example a workload that demands CPU cylces should not be deployed as a memory optimized server.
  • Adopt on demand instances and utilize them early on.  On demand prices are significantly lower than standard prices.  Just be careful because your on demand instances can disappear.
  • In the same ilk of on demand instances, use reserved instances.  These instance types can significantly reduce prices, and have the advantage that they won’t disappear so long running servers and services benefit from this type of cost control.
  • Set up granular billing as early as possible.  Create and optimize alerts based on expected usage for tighter control of costs.  It’s better to start off knowing and controlling environment costs sooner than later.
  • Delete unused EBS volumes.  Servers and volumes can come and go, but often times EBS volumes can become orphaned and essentially no good.  Therefore it is a good idea to clean up unused EBS volumes whenever you can.  Of course this process can and should be automated.

Conclusion

Managing cost and optimizing your cloud infrastructure really could be considered its own discipline in some regards.  Environments can become complex quite quickly with instances, services and resources spinning up and down as well as dynamically growing up and down to accommodate workloads as well as ever evolving environments can lead to what some call “Cloud Sprawl”.

The combination of the tools and cost savings tips mentioned above can be a real lifesaver when you are looking to squeeze out the most bang for your buck out of your cloud environment.  It can also lead to a much more solid understanding of all the moving pieces in your environment and can help determine exactly is going on at any given time, which is especially useful for DevOps admins and engineers.

If you have any other cool tips or tips for controlling AWS costs or other cloud environment costs let me know, I’ll be sure to add them here!

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Fix Xbox Strict NAT on PFSense

Out of the box, it turns out that PFSense is not configured to handle some connection settings for Xbox Live.  Unfortunately I couldn’t find much of an explanation as to what this message actually means as far as degraded online performance but noticed that I would randomly get kicked out of games, get disconnected from XBox Live and have communication issues every once in a awhile so decided to take a look at what was actually going on because the mentioned issues started to get annoying.

I figured it should be easy enough to fix, but I couldn’t find a definite guide on how to fix this issue so I figured I would make sure it is clear for those who find this post and are having the same issue.  I tried a few different combinations, including port forward combinations mentioned in some forums, firewall rule changes, various UPnP settings, etc. but none of these combo’s worked and were unclear not very clear either.

Eventually I found this guide, which works and is great but doesn’t depict how to set everything up.  There are a few steps to get this working correctly so I will briefly describe them below.

Verify the IP address of you Xbox 360.  There is documentation around for finding it, but essentially go to system -> network -> advanced and it should give you the information.  You may want to set a static IP for your Xbox but I won’t cover that here.  Ask me if you have issues.

Now you will need to modify your firewall settings (Firewall -> NAT).  Choose the “Outbound” tab and change the mode to Manual Outbound NAT rule generation.  After you have saved the settings, create an entry for your Xbox and give it the address of your Xbox, with a mask of /32.

Firewall rule

Once this rule has been created, move it up to the top of the rule list.  You should have something similar to the following when done.

Firewall rules

Next, modify UPnP settings (Services -> UPnP & NAT-PMP) and select the following settings.

  • Enable UPnP & NAT-PMP
  • Allow UPnP port mapping
  • External Interface -> WAN
  • Interfaces -> LAN
  • User specified permissions 1- > allow 88-65535 192.168.39.17/32 88-65535

It should look something like this.

UPnP settings

Go ahead and save the settings and restart your Xbox (just turn off and on) to make sure the settings get picked up and that should be it.  I’m not entirely sure the user permissions need to be this wide open but it works so it is there for now.  I will update the post if I find any evidence that the settings need to be modified.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Test Kitchen Tricks

test kitchen

I have been working a lot with Chef and Test kitchen lately and thus have learned a few interesting tricks when running tests with these tools.  Test Kitchen is one of my favorite tools when working with Chef configuration management because it is very easy to use and has a number of powerful features that make testing things in Chef simple and easy.

Test Kitchen itself sits on top of Vagrant and Virtualbox by default so to get started with the most basic usage example of Test Kitchen you will need to have Vagrant installed along with a few other items.

Then to install Test Kitchen.

gem install test-kitchen

That’s pretty much it.  The official docs have some pretty detailed usage and in fact I have learned many of the tricks that I will be writing about today from the docs.

Once you are comfortable with Test Kitchen you can begin leveraging some of the more powerful features, which is what the remainder of this post will cover.  There is a great talk given by the creator of test kitchen at this year’s Chefconf by the creator of Test Kitchen about some of the lessons learned and cool things that you can do with the tool.  If you haven’t already seen it, it is worth a watch.

Anyway, let’s get started.

1) Fuzzy matching

This one is great for the lazy people out there.  It basically allows you match a certain unique part of a command instead of typing out an entire command.  So for example, you can just type in a partial name for a command to return the desired full command.  Since Test Kitchen uses regular expression matching, this can be a very powerful feature.

2) Custom drivers

One reason that Test Kitchen is so flexible is because it can leverage many different plugins and drivers.  And, since it is open source, if there is functionality missing from a driver you can simply write your own.  Currently there is an awesome list of drivers available for Test Kitchen to use, and a wide variety of options available to hopefully suit most testing scenarios.

Of course, there are others as well.  These just happen to be the drivers that I have tried and can verify.  There is even support for alternate configuration management tool testing, which can be handy for those that are not using Chef specifically.  For example there is a salt driver available.

3) .kitchen.local.yml

This is a nice handy little bit that is often overlooked but allows a nice amount of control by overriding the default .kitchen.yml configuration file with specific options.  So for example, if you are using the ec2 driver in your configuration but need to test locally with Vagrant you can simply drop a .kitchen.local.yml on your dev machine and override the driver (and any other settings you might need to change). I have created the following .kitchen.local.yml for testing on a local Vagrant box using 32 bit Ubuntu to highlight the override capabilities of Test Kitchen.

driver: 
 name: vagrant 
 
platforms: 
 - name: ubuntu-1310-i386 
 - name: ubuntu-1404-i386

4) Kitchen diagnose

An awesome tool for diagnosing issues with Test Kitchen.  Running the diagnose will give you lots of juicy info about what your test machines are doing (or should be doing) and a ton of configuration information about them.  Basically, if something is misbehaving this is the first place you should look for clues.

If you want to blast info and settings for all your configurations, just run the following,

kitchen diagnose

5) Concurrency

If you have a large number of systems that need to have tests run on them then running your Test Kitchen tests in parallel is a great way to speed up your total testing time.  Turning on concurrency is pretty straight forward, just add the “-c” flag and the number of instances to run on (the default is 9999).

kitchen converge -c 5

6) Verbose logging

This one can be helpful if your kitchen run is failing with no real clues or helpful information provided by the diagnose command.  It seems obvious but getting this one to work gave me some trouble initially.  To turn on verbosity simply add the debug flag to your test kitchen command.

So for example, if you want to converge a node with verbosity turned on, you would use this command.

kitchen converge -l debug

I recommend taking a look at some or all of these tricks to help improve your integration testing with Test Kitchen.  Of course as I stated, all of this is pretty well documented.  Even if you are already familiar with this tool, sometimes it just helps to have a refresher to remind you of a great tool and to jar your memory.  Let me know if you have any other handy tricks and I will be sure to post them here.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.