Analyzing cloud costs

September 24, 2014 | August 31, 2015 | Josh Reichardt | 19 Comments

Knowing and controlling the costs of a cloud environment is a useful skill for any admin or engineer, and it also greatly helps others inside your organization. Understanding your environment and its cost overhead makes you (or your team) look better when you can pinpoint bottlenecks and anomalies and come up with ways to mitigate costs or track cloud resource utilization. Plus, it can even get you some extra credit.

So with this in mind, I’d like to talk about a few strategies and tools I have been experimenting with to help road map and accurately model costs and utilization for different workloads spread out across an AWS environment.

ICE

The first tool I’d like to mention is ICE, which is probably my favorite. It was developed by Netflix and analyzes costs across your AWS infrastructure. It gives you nice graphs and detailed breakdowns of prices, including spot vs. on demand pricing and many other permutations.

This is the best explanation I can find, pulled right from their GitHub page:

The ability to trend usage patterns on a global scale, yet decompose them down to a region, availability zone, or service team provides incredible flexibility. Ice allows us to quantify our AWS footprint and to make educated decisions regarding reservation purchases and reallocation of resources.

It has a nice interface and some slick filtering, so breaking things down region by region becomes easy, which is not the case for the other tools. This tool is also great for spotting trends and anomalies in your environment, which can sometimes go undetected if not viewed in the correct context.

The downside is the overhead associated with getting this up and running, but there is a Chef cookbook that will pretty much do the installation for you, if you are comfortable with Chef. You will need to override some attributes but otherwise it is pretty straightforward. If you need assistance let me know and I’d be glad to walk you through getting it set up.

AWS Calculator

This is a handy tool to help ballpark and model various costs for AWS services. One disappointing discovery of this tool is that it doesn’t help model spot instance prices.

This is great for mocking out what the TCO of a server or group of servers might look like. It is also good for getting a general feel for what different server costs will be for a certain number of months and/or years.

Be sure to check back on this regularly to stay current, because AWS moves quickly with seemingly constant updates and has been dropping prices steadily over the past 3 years. Especially with the increased competition from Microsoft (Azure) and Google (Google Cloud), AWS seems to be constantly slashing prices and adding new improvements and features to its products.

AWS Billing and Cost Management

This one is pretty self explanatory. It is built right into AWS and, as such, is a powerful tool that can easily be overlooked. It offers a variety of detailed information about costs and billing, along with some nice graphs and charts for trend spotting, and the data can be exported for analysis, which is also nice (even though I haven’t gotten that far yet).

The major downside (in my opinion) is that you can’t get the granular price breakdowns that are available with a tool like ICE. For example, there isn’t an easy way to find a price comparison breakdown for cost per region or other more detailed information.

Trusted Advisor

This tool is great and is free for basic usage. This offering from AWS is really nice for helping to find and optimize settings according to a number of good practice recommendations created by Amazon. Not only does it give you some really nice price breakdowns but it also reports things like security and performance which can be equally useful. Use this often to tighten up areas of your infrastructure and to optimize costs.

One downside of this one is that to unlock all of the features and functionality you need to upgrade to a paid support plan (Business or Enterprise), which is obviously more expensive.

AWS ELK Billing

I just found out about this one but it looks like it might be a very nice solution, leveraging the ELK (Elasticsearch, Logstash, Kibana) stack. I have written a post about getting started with the ELK stack so it shouldn’t be difficult at all to begin playing around with this solution if you are interested.

If you get this tool up and working I would love to hear about it.

Cost saving tips

I have compiled a list of simple yet powerful tips to help control costs in AWS. Ideally a combination of all of these tips would be used to help control costs.

Upgrade server and service instance generations as often as possible for automatic improved performance and reduced price. For example gen 1 to gen 3, m1.xxx -> m3.xxx.
Try to size servers correctly by keeping them busy. Servers that are running but aren’t doing anything are essentially wasting money. Either run them according to time of day or bump up the amount of utilization per box, either by downsizing the server or upping the workload.
On that note, size servers correctly according to workload. For example a workload that demands CPU cycles should not be deployed on a memory optimized server.
Adopt spot instances and utilize them early on. Spot prices are usually significantly lower than on demand prices. Just be careful because your spot instances can be reclaimed by AWS and disappear with little notice.
In the same vein as spot instances, use reserved instances. Reserved instances can significantly reduce prices, and have the advantage that they won’t disappear, so long running servers and services benefit from this type of cost control.
Set up granular billing as early as possible. Create and optimize alerts based on expected usage for tighter control of costs. It’s better to start off knowing and controlling environment costs sooner than later.
Delete unused EBS volumes. Servers and volumes can come and go, but often times EBS volumes can become orphaned and essentially no good. Therefore it is a good idea to clean up unused EBS volumes whenever you can. Of course this process can and should be automated.
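
As a rough illustration of automating that last tip, here is a minimal sketch using the AWS CLI. It assumes the CLI is installed and configured with credentials and a default region. The function names and the DRY_RUN variable are made up for this example and are not part of any tool mentioned above. Volumes in the "available" state are the ones not attached to any instance.

```shell
# Print the IDs of EBS volumes that are not attached to anything
# ("available" is the state AWS reports for unattached volumes).
list_unattached_volumes() {
  aws ec2 describe-volumes \
    --filters Name=status,Values=available \
    --query 'Volumes[].VolumeId' \
    --output text | tr '\t' '\n' | sed '/^$/d'
}

# Delete every unattached volume. By default this only prints what it
# would do; set DRY_RUN=0 to really delete (hypothetical safety switch).
delete_unattached_volumes() {
  local vol
  for vol in $(list_unattached_volumes); do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would delete $vol"
    else
      aws ec2 delete-volume --volume-id "$vol"
    fi
  done
}
```

Running delete_unattached_volumes as is only prints the candidates, which is a good moment to check that nothing in the list is a volume you detached on purpose and still need.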

Conclusion

Managing cost and optimizing your cloud infrastructure really could be considered its own discipline in some regards. Environments can become complex quite quickly: instances, services and resources spin up and down, capacity grows and shrinks dynamically to accommodate workloads, and the environment itself keeps evolving. All of this can lead to what some call “Cloud Sprawl”.

The combination of the tools and cost saving tips mentioned above can be a real lifesaver when you are looking to squeeze the most bang for your buck out of your cloud environment. It can also lead to a much more solid understanding of all the moving pieces in your environment and can help determine exactly what is going on at any given time, which is especially useful for DevOps admins and engineers.

If you have any other cool tips for controlling AWS costs or other cloud environment costs let me know, and I’ll be sure to add them here!

Test Kitchen Tricks

September 2, 2014 | August 31, 2015 | Josh Reichardt | 27 Comments

I have been working a lot with Chef and Test Kitchen lately and have learned a few interesting tricks for running tests with these tools. Test Kitchen is one of my favorite tools when working with Chef configuration management because it is very easy to use and has a number of powerful features that make testing things in Chef simple and easy.

Test Kitchen itself sits on top of Vagrant and Virtualbox by default so to get started with the most basic usage example of Test Kitchen you will need to have Vagrant installed along with a few other items.

Vagrant (your distribution)
Virtualbox (again with your distribution)
Ruby 1.9 or higher + RubyGems

Then install Test Kitchen:

gem install test-kitchen

That’s pretty much it. The official docs have some pretty detailed usage and in fact I have learned many of the tricks that I will be writing about today from the docs.

Once you are comfortable with Test Kitchen you can begin leveraging some of the more powerful features, which is what the remainder of this post will cover. There is a great talk given by the creator of Test Kitchen at this year’s ChefConf about some of the lessons learned and cool things that you can do with the tool. If you haven’t already seen it, it is worth a watch.

Anyway, let’s get started.

1) Fuzzy matching

This one is great for the lazy people out there. It basically allows you to match a certain unique part of a command instead of typing out an entire command. So for example, you can just type in a partial name for a command to return the desired full command. Since Test Kitchen uses regular expression matching, this can be a very powerful feature.

2) Custom drivers

One reason that Test Kitchen is so flexible is because it can leverage many different plugins and drivers. And, since it is open source, if there is functionality missing from a driver you can simply write your own. Currently there is an awesome list of drivers available for Test Kitchen to use, and a wide variety of options available to hopefully suit most testing scenarios.

Of course, there are others as well. These just happen to be the drivers that I have tried and can verify. There is even support for alternate configuration management tool testing, which can be handy for those that are not using Chef specifically. For example there is a salt driver available.

3) .kitchen.local.yml

This is a nice handy little bit that is often overlooked but allows a nice amount of control by overriding the default .kitchen.yml configuration file with specific options. So for example, if you are using the ec2 driver in your configuration but need to test locally with Vagrant you can simply drop a .kitchen.local.yml on your dev machine and override the driver (and any other settings you might need to change). I have created the following .kitchen.local.yml for testing on a local Vagrant box using 32 bit Ubuntu to highlight the override capabilities of Test Kitchen.

driver:
  name: vagrant

platforms:
  - name: ubuntu-1310-i386
  - name: ubuntu-1404-i386

4) Kitchen diagnose

This is an awesome tool for diagnosing issues with Test Kitchen. Running diagnose will give you lots of juicy info about what your test machines are doing (or should be doing) and a ton of configuration information about them. Basically, if something is misbehaving this is the first place you should look for clues.

If you want to blast info and settings for all your configurations, just run the following:

kitchen diagnose

5) Concurrency

If you have a large number of systems that need to have tests run on them then running your Test Kitchen tests in parallel is a great way to speed up your total testing time. Turning on concurrency is pretty straightforward: just add the “-c” flag and the number of instances to run on (the default is 9999).

kitchen converge -c 5

6) Verbose logging

This one can be helpful if your kitchen run is failing with no real clues or helpful information provided by the diagnose command. It seems obvious but getting this one to work gave me some trouble initially. To turn on verbosity simply add the log level flag, set to debug, to your Test Kitchen command.

So for example, if you want to converge a node with verbosity turned on, you would use this command.

kitchen converge -l debug

I recommend taking a look at some or all of these tricks to help improve your integration testing with Test Kitchen. Of course, as I stated, all of this is pretty well documented. Even if you are already familiar with this tool, sometimes it just helps to have a refresher to remind you of a great tool and to jog your memory. Let me know if you have any other handy tricks and I will be sure to post them here.

Transitioning from bash to zsh

August 16, 2014 | August 31, 2015 | Josh Reichardt | 1 Comment

I have known about zsh for a long time now but never really had a compelling reason to switch my default shell from bash. Just recently, though, I have been hearing more and more people talking about how powerful and awesome zsh is, so I thought I might as well take the dive and get started since that’s what all the cool kids seem to be doing these days. At first I thought that changing my shell was going to be a PITA with all the customizations and idiosyncrasies that I have grown accustomed to using bash, but I didn’t find that to be the case at all when switching to zsh.

First and foremost, I used a tool called oh-my-zsh to help with the transition. If you haven’t heard about it yet, oh-my-zsh aims to be a sort of framework for zsh. This project is a nice clean way to get started with zsh because it gives you a nice set of defaults out of the box without having to do much configuration or tweaking. I found that many of my little tricks and shortcuts were already baked in to oh-my-zsh, along with a ton of other settings and customizations that I did not have using bash.

From their GitHub page:

oh-my-zsh is an open source, community-driven framework for managing your ZSH configuration. It comes bundled with a ton of helpful functions, helpers, plugins, themes, and few things that make you shout…

Here are just a few of the improvements that zsh/oh-my-zsh offer:

Improved tab completion
Persistent history across all shells
Easy to use plugin system
Easy to use theme system
Autocorrect

The most obvious difference that I have noticed is the improved, out of the box tab completion, which I think should be enough on its own to convince you! On top of that, most of my tricks and customizations were already turned on with oh-my-zsh. Another nice touch is that themes and plugins come along as part of the package, which is really nice for easing the transition.

So after spending an afternoon with zsh I am convinced that it is the way to go (at least for my own workflow). Of course there are always caveats and hiccups along the way, as I’ve learned there are with pretty much everything.

Tuning up tmux

Out of the box, my tmux config uses the default shell, which happens to be bash. So I needed to modify my ~/.tmux.conf to reflect the switch over to zsh. It is a pretty straightforward change but is something that you will need to make note of if you use tmux and are transitioning to zsh.

set-option -g default-shell /usr/bin/zsh

I am using Ubuntu 14.04, so my zsh is installed to /usr/bin/zsh. The other thing that you will need to do is make sure you kill any stale tmux processes after updating to zsh. I found one running in the background that was blocking me from using the new config.
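
Since the zsh path differs between distributions, here is a small sketch that looks the path up instead of hard-coding it. The function name is hypothetical, and it assumes the shell you want is on your PATH.

```shell
# Print a tmux config line pointing default-shell at wherever the given
# shell (zsh by default) is installed on this machine.
tmux_default_shell_line() {
  local shell_name="${1:-zsh}" shell_path
  shell_path=$(command -v "$shell_name") || {
    echo "$shell_name not found in PATH" >&2
    return 1
  }
  echo "set-option -g default-shell $shell_path"
}

# Possible usage:
#   tmux_default_shell_line >> ~/.tmux.conf
#   tmux kill-server   # stops every running tmux session so the new config is read
```

Note that tmux kill-server closes all of your sessions, so only run it when nothing important is open.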

Goodies

There is a nice command cheat sheet for zsh. Take some time to learn these shortcuts and aliases; they are great time savers and are very useful.

oh-my-zsh comes bundled up with a large number of goodies. At the time of this writing there were 135 plugins as well as a variety of themes. You can check the plugins wiki page for descriptions for the various plugins. To turn on a specific plugin you will need to add it to your ~/.zshrc config file. Find the following line in your config.

plugins=(git)

and add plugins separated by spaces:

plugins=(git vagrant chef)

You will need to reload the config for the changes to be picked up.

source ~/.zshrc

Most themes are hosted on the wiki, but there is also a web site dedicated to displaying the various themes, which is really cool. It does a much better job of showing differences between various themes. You can check it out here. Themes function in a similar way to plugins. If you want to change your theme, edit your ~/.zshrc file and select the desired theme.

ZSH_THEME="clean"

You will need to reload your config for this option as well.

source ~/.zshrc

Conclusion

If you haven’t already made the switch to zsh I recommend that you at least experiment and play around with it before you make any final decisions. You may be set in your ways and happy with bash or any other shell that you are used to, but for me, all the awesomeness changed my opinion and made me decide to reevaluate my biases. If you’re worried about making the switch, using oh-my-zsh makes the transition so painless there is practically no reason not to try it out.

This post is really just the tip of the iceberg for the capabilities of this shell; I just wanted to expose readers to all of its glory. Zsh offers so much more power and customization than I have covered in this post and is an amazing productivity tool with little learning overhead.

Let me know if you have any awesome zsh tips or tweaks that folks should know about.

Uchiwa dashboard for Sensu

August 12, 2014 | August 31, 2015 | Josh Reichardt | 9 Comments

Recently the new Uchiwa dashboard redesign for Sensu was released, and it is awesome. It’s hard to describe how much of a leap forward this most recent release is, but it finally feels like Sensu is as “complete” and polished a product as other open source and commercial products that exist. And if you haven’t heard of Sensu yet you are missing out. As described on the website, sensuapp.org, Sensu is an open source monitoring framework. Instead of the traditional monolithic type of monitoring solutions (cough Nagios cough) that typically come to mind, the design of Sensu allows for a more scalable and distributed approach to monitoring which hasn’t really been done before and offers a number of benefits, including a variety of dashboards to choose from.

Sensu touts itself as a “monitoring router”, which is a much more intuitive approach to monitoring once you wrap your head around the concept and leave the monolithic idea alone. For example, you can plug different components in to your monitoring solution very easily with Sensu, and you aren’t tied to one solution. If you need graphing and analytics you can choose from any number of existing solutions (Graphite, hosted Graphite, DataDog, NewRelic, etc.) and, more importantly, if something isn’t working as well as you’d like you can simply rip out that component in favor of something that fits your needs better. That flexibility means no more hammering square pegs in to round holes. Sensu also offers nice scalability features: since all of the pieces are loosely coupled you don’t need to worry about scaling the entire beast, and you can pick and choose which pieces to scale and when. Sensu itself is also scalable. Since the backbone of Sensu relies on RabbitMQ (soon to be opened up to other message queueing services), as it gets busier you can simply cluster RabbitMQ or add nodes to your RabbitMQ cluster. Granted, RabbitMQ isn’t exactly the easiest thing to scale, but it is possible.

With its distributed nature, Sensu by default is just a monitor. In the beginning, that meant either writing your own dashboard to communicate with Sensu server or using the default dashboard. As the ecosystem has evolved, the default dashboard has not been able to keep up with the evolution of Sensu and the needs of those using it.

Traditionally in the monitoring world, if you are not familiar, design and usability have not exactly been high priorities with regards to dashboards, graphics and GUIs in the majority of tools that exist. That is changing somewhat with some of the newer cloud tools like DataDog and NewRelic; the only problem is that those solutions are commercial and can become expensive. The bane of the open source solutions, at least for me, is how ugly the dashboards and user experiences have been (the Sensu default dashboard was an exception). But the latest release of Uchiwa for Sensu has really changed the game in my opinion. It is much more modern and elegant.

We have gone from this:

To this:

Which one would you rather use? It is much easier to use and is much more elegant. The main dashboard (pictured above) gives a nice 1,000 ft view of what is going on in your environment. It is easy to quickly check the dashboard for any issues going on in your environment.

In addition to the home view, there is a nice checks view to get a glimpse of pretty much everything that’s going on in your environment. Sometimes with a large number of checks it is very easy to forget what exactly is happening so this is a nice way to double check.

As well, there is another similar view for checking clients. One small but very nice piece of info here is that it will display the Sensu client version for each host. If there are any issues with a host it is easy to tell from here.

You can also drill down in to any of these hosts to get a better picture of what exactly is going on. It will show you exactly which checks are being run for the host as well as some other very handy information.

From this page you can even select an individual check and see exactly how it is set up and behaving. It is easy to silence a single alert or all alerts for a client. Just click on the sound icon in any context to silence or unsilence an alert or an entire client. This has been handy for minimizing alert spam when doing maintenance on specific hosts.

One last handy feature is the info page. From here you can check out some of the Sensu server info as well as Uchiwa settings. This is also good for troubleshooting.

That pretty much covers the highlights of the new UI. As I have said, I am very excited for this release because this is an awesome GUI, and there are going to be some really interesting improvements and additions in the future for Uchiwa, which will make an even more compelling case for switching to Sensu and Uchiwa if you haven’t already.

If you have direct questions about the post, you can comment here. Otherwise, the best place to get help with most of this stuff is probably the #sensu channel on IRC. That’s where the majority of the project contributors hang out. You can check out the Uchiwa code as well if you’d like over on GitHub. If you ever have issues with the dashboard that is the place to go: I would suggest browsing through the issues and, if you can’t find a solution, creating a new issue. Don’t hesitate to jump in to any of the discussions either. The author is very friendly and helpful and is very quick to respond to issues. One final helpful resource is the Sensu docs. Make sure you are looking at the docs for the correct version of Sensu; there are still enough changes occurring that the docs differ between versions, which can trip up new users.

7 useful but hard to remember Linux commands

August 9, 2014 | August 31, 2015 | Josh Reichardt | 22 Comments

I have found myself using these commands over and over so I decided I’d take the time to go ahead and document them for future me as well as readers because I find these commands pretty useful. I just always manage to forget them, hence the title of the post. The smart thing to do would be to create aliases for these commands but I have just been too lazy and some of them are run across different servers so it isn’t always a convenient option.

Anyway, let’s go ahead and run through the commands before I forget…

1) du -ah / | sort -rh | head -n 50

This one is really handy for debugging space issues. It will list the 50 largest files and directories, with the largest at the top of the list. The -h on sort matters here: du -h prints human readable sizes like 900K and 2G, and a plain numeric sort would order those incorrectly. Notice the “/” will specify the location to search so you can easily modify this one to search different locations, like “/var/log” for example if you are having trouble with growing log files.
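
Since the starting directory and the number of results are the two things that tend to change, the same idea can be wrapped in a small function. This is only a sketch: the function name is made up, and it assumes GNU sort, whose -h flag understands the human readable sizes that du -h prints.

```shell
# Show the N largest files and directories under DIR, largest first.
# Defaults: DIR=/ and N=50. Permission errors from du are discarded.
largest_entries() {
  local dir="${1:-/}" count="${2:-50}"
  du -ah "$dir" 2>/dev/null | sort -rh | head -n "$count"
}

# Possible usage:
#   largest_entries /var/log 20
```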

1.5) du -sh /*

This will quickly give you an idea of how your disk space has been allocated. Definitely handy when you are troubleshooting.

2) git checkout -- .

I don’t use this one very often, which is probably why I manage to forget it so easily. But I really like it. Sometimes I will be working on a git repo across different machines at the same time and will run in to conflicts committing to the repo, or more likely I committed changes on one machine and just need to pull down the newest changes but can’t since I have made modifications. For those scenarios you can run the above command to reset your changes quickly and easily. Be aware that it throws away all uncommitted modifications to tracked files in the current directory, so make sure you don’t need them.

3) tmux kill-window -t 3

I use tmux for my terminal and window manager on all my workstations and love it. If you haven’t heard of it, take a look here. Sometimes a window can get stuck, so it becomes necessary to close that window without destroying the tmux session. Again, this doesn’t happen very often so it is sometimes hard for me to remember the exact syntax, but this one is a handy little trick for managing tmux windows and sessions.

4) grep -r "text"

I know, I should really have this one memorized by now. I am trying to remember but I don’t find myself using this one all that often even though it is really powerful and useful. This will search recursively through every file under the current directory (or under a path you add at the end) and print each line that matches the text pattern you feed to it.
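
Two variations I find worth knowing are sketched below as small functions (the function names are made up for illustration). The -n flag adds line numbers to each match, and -l prints only the names of the files that contain a match.

```shell
# Recursively search DIR (default: current directory) for PATTERN,
# printing file name, line number and the matching line.
search_tree() {
  local pattern="$1" dir="${2:-.}"
  grep -rn -- "$pattern" "$dir"
}

# Print only the names of files under DIR that contain PATTERN.
files_containing() {
  local pattern="$1" dir="${2:-.}"
  grep -rl -- "$pattern" "$dir"
}

# Possible usage:
#   search_tree "timeout" /etc
#   files_containing "TODO" ~/projects
```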

5) kill $(pgrep process)

This one is handy when there are a large number of stuck processes and you need to blow them all out with one command. For example if the chrome browser ever gets stuck with a million tabs open, there are likely a large number of processes all with the same or similar names. If you pass all or part of the process name in to this command, pgrep will find them and kill will destroy them.

6) docker rm $(docker ps -a -q)

I have been using Docker more and more recently and every once in a while I find myself with a large number of stopped Docker containers that need to be cleaned up. This command will blow out all of these stale containers at once (docker rm refuses to remove containers that are still running, so those are left alone). This is nice because old containers can take up a large amount of disk space and can fill up your drives without you being aware. I have been able to reclaim large amounts of disk space with this command.
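
A slightly more cautious variant is sketched below. Instead of passing every container ID to docker rm, it asks Docker for exited containers only, using the status=exited filter on docker ps, and skips the rm call when there is nothing to remove. The function name is hypothetical.

```shell
# Remove only containers that have already exited.
remove_exited_containers() {
  local ids
  ids=$(docker ps -a -q -f status=exited)
  if [ -z "$ids" ]; then
    echo "nothing to remove"
    return 0
  fi
  # Unquoted on purpose: each container ID becomes its own argument.
  docker rm $ids
}
```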

7) watch -n 10 df -ah

This is another good one for checking disk space issues. It will update you every ten seconds with the disk utilization of the system. Pretty straightforward but a great tool to help troubleshoot space issues.
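
If you would rather have a script do the watching, here is a sketch that pulls the used percentage out of df and compares it against a threshold. The function names and the 90% default are assumptions made for this example. It uses df -P for a stable, one line per filesystem output format.

```shell
# Print the used-space percentage (without the % sign) of the
# filesystem that holds the given path (default: /).
disk_used_percent() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning and return non-zero when usage reaches the limit.
check_disk() {
  local path="${1:-/}" limit="${2:-90}" used
  used=$(disk_used_percent "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}

# Possible usage:
#   check_disk /var 80
```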

That’s all I have for now. There are lots more, but these are the most useful ones that I find myself forgetting most often, so hopefully this post will serve as a nice reminder. If you have any cool or useful commands that you would like to share feel free to comment and I will update the post to include them.