Recover a Grafana dashboard

Grafana can optionally use Elasticsearch to store its dashboards.  If you ever migrate your Graphite/Grafana servers, or simply need to grab all of your dashboards from the old server, you will likely be looking for them in Elasticsearch.  Luckily, moving the dashboards to a new server is an uncomplicated process.  In this post I will walk through the process of moving Grafana dashboards between servers.

This guide assumes that Elasticsearch has been installed on both the old and new servers.  The first thing to look at is your current Grafana config.  This is the file that you probably used to set up your Grafana environment originally.  It lives in the directory that you placed your Grafana server files in, and is named config.js.  There is a block inside this config file that tells Grafana which Elasticsearch index to save dashboards in.  The index is called “grafana-dashboards” by default, and the block should look something like this:

/**
 * Elasticsearch index for storing dashboards
 *
 */
 grafana_index: "grafana-dash-orig",

Now, if you still have access to the old server, it is merely a matter of copying the Elasticsearch directory that houses your Grafana dashboards over to the new location.  By default on an Ubuntu installation the Elasticsearch data files are placed in the following path:

/var/lib/elasticsearch/elasticsearch/nodes/0/indices/grafana-dashboards

Replace “0” with the node number if this is a clustered Elasticsearch instance; otherwise you should see the grafana-dashboards directory.  Simply copy this directory over to the new server with rsync or scp and put it in a temporary location for the time being (/tmp, for example).  Rename the existing grafana-dashboards directory on the new server to something different, in case there are some newly created dashboards that you would like to retrieve.  Then move the original dashboards (from the old server) from /tmp in to the above path, making sure the directory is named grafana-dashboards.  The final step is to chown the directory and its contents to the elasticsearch user.  The steps for accomplishing this task are similar to the following.

On the old host:

cd /var/lib/elasticsearch/elasticsearch/nodes/0/indices/
rsync -avP -e ssh grafana-dashboards user@remote_host:/tmp/

On the new host:

cd /var/lib/elasticsearch/elasticsearch/nodes/0/indices/
mv grafana-dashboards grafana-dash-orig
mv /tmp/grafana-dashboards ./grafana-dashboards
chown -R elasticsearch:elasticsearch grafana-dashboards
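The steps on the new host can also be scripted.  What follows is only a minimal sketch in Python, assuming the directory names used above; the function name swap_index_dir and its parameters are made up for illustration, and the optional ownership change only works when run as root.

```python
import os
import shutil


def swap_index_dir(indices_dir, incoming_dir,
                   index_name="grafana-dashboards",
                   backup_name="grafana-dash-orig",
                   owner=None):
    """Move a copied index directory into place, keeping the current one as a backup.

    owner is an optional (user, group) pair, e.g. ("elasticsearch", "elasticsearch").
    """
    live = os.path.join(indices_dir, index_name)
    backup = os.path.join(indices_dir, backup_name)

    # Refuse to clobber a backup left over from an earlier run.
    if os.path.exists(backup):
        raise FileExistsError(f"backup directory already exists: {backup}")

    # Keep any dashboards created on the new server instead of overwriting them.
    if os.path.isdir(live):
        os.rename(live, backup)

    # Put the dashboards copied from the old server where Elasticsearch expects them.
    shutil.move(incoming_dir, live)

    # Equivalent of chown -R; requires root privileges.
    if owner is not None:
        user, group = owner
        shutil.chown(live, user=user, group=group)
        for root, dirs, files in os.walk(live):
            for name in dirs + files:
                shutil.chown(os.path.join(root, name), user=user, group=group)
    return live
```

Something like swap_index_dir("/var/lib/elasticsearch/elasticsearch/nodes/0/indices", "/tmp/grafana-dashboards", owner=("elasticsearch", "elasticsearch")) would then mirror the commands above.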

You don’t even need to restart the webserver or Elasticsearch for the old dashboards to show up.  Just reload the page and bam.   Dashboards.

[Image: grafana dashboard]

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Cloud Backup Tutorial

I have been knee deep in backups for the past few weeks, but I think I can finally see light at the end of the tunnel.  What looked like a simple enough idea to implement turned out to be a much more complicated task to accomplish.  I don’t know why, but there seems to be practically no information out there covering this topic.  Maybe it’s just because backups suck?  Either way, they are extremely important to the vitality of a company: without a workable copy of your data, you are screwed if something happens to the original.  So today I am going to write about managing cloud data and cloud backups and hopefully shine some light on this seemingly foreign topic.

Part of being a cloud based company means dealing with cloud based storage.  Some of the terms involved are slightly different than the standard backup and storage terminology.  Things like buckets, object based storage, S3, GCS, boto all come to mind when dealing with cloud based storage and backups.  It turns out that there are a handful of tools out there for dealing with our storage requirements which I will be discussing today.

The Google and Amazon APIs are nice because they allow for creating third party tools to manage the storage, outside of their official and standard tools.  In my journey to find a solution I ran across several workable tools that I would like to mention.  The end goal of this project was to sync a massive number of files from S3 storage to GCS.  I found that the following tools all met at least some of my requirements and each has its own set of uses.  They are included here in no particular order:

  • duplicity/duply – This tool works with S3 for small scale storage.
  • Rclone – This one looks very promising, supports S3 to GCS sync.
  • aws-cli – The official command line tool supported by AWS.

S3cmd – This was the first tool that came close to doing what I wanted.  It is a really nice tool for smallish numbers of files, has a number of handy options and is capable of syncing S3 buckets.  Unfortunately, the way it is designed does not cope well with reading and writing a very large number of files.  It is a great tool for smaller sets of data.

s3s3mirror – This is an extremely fast copy tool written in Java and hosted on GitHub.  This thing is awesome at copying data quickly.  It was able to copy about 6 million files in a little over 5 hours the other day.  One extremely nice feature is that it has an intelligent sync built in, so it knows which files have already been copied over.  Even better, it is faster still when it only needs to do reads, so once your initial sync has completed, additional syncs are blazing fast.

This is a jar file so you will need to have Java installed on your system to run it.

sudo apt-get install default-jre-headless

Then you will need to grab the code from GitHub.

git clone https://github.com/cobbzilla/s3s3mirror.git

And to run it.

./s3s3mirror.sh first-bucket/ second-bucket/

That’s pretty much it.  There are some handy flags but this is the main command.  There is an -r flag for changing the retry count, a -v flag for verbosity and troubleshooting, as well as a --dry-run flag to see what will happen without copying anything.
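If you end up calling s3s3mirror from a script or a cron job, it can help to build the command in one place.  The following is a rough sketch in Python that only uses the flags mentioned above; the helper names s3s3mirror_command and run_mirror are my own and are not part of s3s3mirror, and the script path assumes you are in the cloned repository.

```python
import subprocess


def s3s3mirror_command(source, destination, dry_run=False, verbose=False,
                       retries=None, script="./s3s3mirror.sh"):
    """Build the argument list for an s3s3mirror run."""
    cmd = [script]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied without copying
    if verbose:
        cmd.append("-v")         # extra output for troubleshooting
    if retries is not None:
        cmd.extend(["-r", str(retries)])  # change the retry count
    cmd.extend([source, destination])
    return cmd


def run_mirror(source, destination, **options):
    """Run s3s3mirror and raise CalledProcessError if it exits non-zero."""
    subprocess.run(s3s3mirror_command(source, destination, **options), check=True)
```

A cautious first pass might be run_mirror("first-bucket/", "second-bucket/", dry_run=True, verbose=True), followed by the same call without dry_run once the output looks right.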

The only downside of this tool is that it only seems to support S3 at this point.  The source is posted on GitHub, though, so it could be adapted to work with GCS, which is something I am actually looking at doing.

Gsutil – The Python command line tool created and developed by Google.  This is the most powerful tool that I have found so far.  It has a ton of command line options, can communicate with other cloud providers, is open source and is under active development and maintenance.  Gsutil is scriptable and has code for dealing with failures: it can retry failed copies, supports resumable transfers, and checks which files and directories already exist for scenarios where synchronizing buckets is important.

The first step to using gsutil after installation is to run through the configuration with the gsutil config command.  Follow the instructions to link gsutil with your account.  After the initial configuration has been run you can modify or update all the gsutil goodies by editing the config file, which lives in ~/.boto by default.  Two settings worth mentioning are parallel_process_count and parallel_thread_count.  These control how many processes and threads gsutil uses at once, so on really beefy boxes you can crank them up quite a bit higher than their defaults.  To make use of the parallel processing you simply need to set the -m flag on your gsutil command.

gsutil -m cp -R <source> gs://bucket-name
gsutil -m rsync -r <source> gs://bucket-name
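Editing ~/.boto by hand is fine, but if you manage several boxes you may want to script it.  Here is a small sketch using Python’s configparser, assuming the two settings live in the [GSUtil] section of the file; the function name set_gsutil_parallelism is hypothetical.  Be aware that configparser drops comments when it rewrites a file, so keep a copy of the original if you care about the comments that gsutil config generates.

```python
import configparser
import os


def set_gsutil_parallelism(processes, threads, path="~/.boto"):
    """Set parallel_process_count and parallel_thread_count in a boto config file."""
    path = os.path.expanduser(path)

    # interpolation=None so that any '%' characters in existing values are left alone.
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)

    if not config.has_section("GSUtil"):
        config.add_section("GSUtil")
    config.set("GSUtil", "parallel_process_count", str(processes))
    config.set("GSUtil", "parallel_thread_count", str(threads))

    # Note: rewriting the file discards any comments it contained.
    with open(path, "w") as handle:
        config.write(handle)
```

For example, set_gsutil_parallelism(8, 20) would write both values; the right numbers depend entirely on the box, so treat these as placeholders.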

One very nice feature of gsutil is that it has built in functionality to interact with AWS and S3 storage.  To enable this functionality you need to copy your AWS access key ID and secret access key in to the [Credentials] section of your ~/.boto config file, as aws_access_key_id and aws_secret_access_key.  After that, you can test the updated config by listing your buckets that live on S3.

gsutil ls s3://
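If the listing fails, the first thing to rule out is a missing or empty credential.  This is a minimal sketch of such a check in Python, assuming the AWS keys are stored as aws_access_key_id and aws_secret_access_key under [Credentials]; the function name missing_aws_credentials is made up for this example.

```python
import configparser
import os


def missing_aws_credentials(path="~/.boto"):
    """Return the AWS credential keys that are absent or empty in a boto config file."""
    config = configparser.ConfigParser(interpolation=None)
    # A missing file is simply treated as an empty config.
    config.read(os.path.expanduser(path))

    required = ("aws_access_key_id", "aws_secret_access_key")
    return [key for key in required
            if not config.get("Credentials", key, fallback="").strip()]
```

An empty list means both keys are present; it says nothing about whether they are valid, which only the gsutil ls call itself can tell you.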

So your final command to sync an S3 bucket to Google Cloud would look similar to the following:

gsutil -m cp -R s3://bucket-name gs://bucket-name

Notice the -R flag, which makes the copy recursive, so everything in one bucket is copied to the other instead of just a single layer.

There is one final tool that I’d like to cover.  It isn’t a command line tool, but it turns out to be incredibly useful for copying large sets of data from S3 in to GCS: the GCS Online Import tool.  Follow the link and fill out the interest form, and after a little while you should hear from somebody at Google about setting up and using your new account.  It is free to use and the support is excellent.  Once you have been approved you will need to provide a little bit of information to set up sync jobs (your AWS ID and key) and allow your Google account to sync the data, but it is all very straightforward.  This tool saved me from having to sync my S3 storage to GCS by hand, which would have taken at least 7 days (and that was even with a monster EC2 instance).

Ultimately, the tools you choose will depend on your specific requirements.  I ended up using a combination of s3s3mirror, AWS bucket versioning, the Google cloud import tool and gsutil.  But my requirements are probably different from the next person’s, and each backup scenario is unique, so combining these tools gives you the flexibility to handle pretty much any scenario.  Let me know if you have any questions or know of some other tools that I have failed to mention here.  Cloud backups are an interesting and unique challenge that I am still mastering so I would love to hear any tips and tricks you may have.

Chef data bags with Test Kitchen

As a step towards integrating your Chef cookbooks with Jenkins CI and your testing/release pipeline it is important to make sure that local changes pass unit and integration tests before being accepted and committed into version control.  For example, when running test kitchen it is important to fully simulate what data bags and encrypted data bags are doing on a local box for many tests to pass correctly.  So, today I would like to focus on a stumbling block towards Jenkins and integration testing that I ran in to recently.  There are a few lessons that I learned along the way that I would like to share to help clarify things a little bit because there wasn’t much good info out there on how to do this.

First, I need to give credit where it is due.  This post was a great resource in my journey to find a solution to my test kitchen data bag issue.

The largest roadblock I found along the way was that the version of test kitchen I was using shipped with chef-solo as the default provisioner.  There has been a lot of discussion around this topic lately and (from what I understand) it has pretty much become the general consensus within the Chef community that chef-solo should be replaced by chef-zero.  There are a number of advantages to using chef-zero instead of chef-solo, including one I learned the hard way: chef-zero can act as a stand alone Chef server, which unlocks the ability to store data bags and encrypted data bags without any wacky hacking to get Chef to compile and converge correctly.

There was a good post written recently that expounds more on the benefits of using chef-zero instead of chef-solo.  It is here, and is definitely worth the read if you are interested in learning more about the benefits of chef-zero.

So with that knowledge in mind, here is what a newly updated sample .kitchen.yml file might look like:

---
driver:
  name: vagrant

provisioner:
  name: chef_zero

platforms:
  - name: ubuntu-13.10-i386
  - name: centos-6.4-i386

suites:
  - name: default
    data_bags_path: "test/integration/data_bags"
    run_list:
      - recipe[recipe-to-test]
    attributes:

It’s a pretty straightforward config.  The biggest change that you will notice is that the provisioner has been changed from chef-solo to chef-zero (chef_zero in the config), which I now know makes all the difference in the world.  The next big change to observe is the data_bags_path in the suites section.  This bit of configuration tells the Chef provisioner to look at the specified file path when chef-zero spins up and to load the data bags, encrypted data bags or other information found there, which would otherwise live on the Chef server for clients to use.

So in the test/integration/data_bags directory I have a directory, and a JSON file inside that directory, for the specific data I am interested in: sensu/ssl.json.  This file essentially contains the same information that is stored on the Chef server about the SSL certificates used for live hosts in the production environment, just mirrored into a sandbox/integration testing environment.

If you’re interested, here is a sample of what the ssl.json file might look like:

{
  "id": "ssl",
  "server": {
    "key": "-----BEGIN RSA PRIVATE KEY-----gM...",
    "cert": "-----BEGIN CERTIFICATE-----gM...",
    "cacert": "-----BEGIN CERTIFICATE-----gM..."
  },
  "client": {
    "key": "-----BEGIN RSA PRIVATE KEY-----gM...",
    "cert": "-----BEGIN CERTIFICATE-----gM..."
  }
}

Note that the “id” is “ssl”.  As far as I know, the file name (minus the .json extension) must match the id when you are creating this JSON file.
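Since a mismatched id or a stray comma can be hard to spot once a converge is failing, a quick check before running test kitchen can save some time.  This is just a sketch in Python, assuming the <bag>/<item>.json layout described above and the matching rule as I understand it; check_data_bags is a made up helper, not part of Chef or test kitchen.

```python
import json
from pathlib import Path


def check_data_bags(data_bags_path="test/integration/data_bags"):
    """Return a list of problems found in <bag>/<item>.json files."""
    problems = []
    for item_file in sorted(Path(data_bags_path).glob("*/*.json")):
        try:
            item = json.loads(item_file.read_text())
        except json.JSONDecodeError as error:
            problems.append(f"{item_file}: invalid JSON ({error.msg})")
            continue
        # The item's "id" is expected to equal the file name without ".json".
        if not isinstance(item, dict) or item.get("id") != item_file.stem:
            problems.append(f"{item_file}: id does not match file name")
    return problems
```

Running check_data_bags() from the cookbook directory and printing the result should give an empty list when everything lines up.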

Now you should be able to create and converge your test recipe with test kitchen:

kitchen create ubuntu
kitchen converge ubuntu

If you have any difficulty, let me know.  I tried to be thorough in this write up but could have accidentally skipped important information.  The main takeaways should be 1) use chef-zero wherever possible and 2) make sure your data bag paths and files are created correctly and referenced correctly in your .kitchen.yml file.  Finally, if you are still having issues, make sure you have triple checked the spelling of your paths and the JSON syntax of your data bag files.

Review: Webmin Administrator’s Cookbook

[Image: webmin cookbook]

I just recently finished reading the Webmin Administrator’s Cookbook and thought I would share some of my thoughts and opinions about the book.  While I don’t typically review books on the blog, I thought this would be a good opportunity to discuss a nice book.  It is written by a very knowledgeable and credible author, Michal Karzynski.  His background includes over a decade of experience as a developer in various programming languages as well as a scientific research background.

This book is a good read for everyone from seasoned veterans and professionals all the way down to aspiring and freshly minted admins.

The book itself covers a broad, inclusive set of topics, including logging, user management, backups, web server administration and many others.  The book uses the Webmin tool as a sort of framework for discussing various administrative topics and tasks.  From their website, Webmin is described as follows:

Webmin is a web-based interface for system administration for Unix. Using any modern web browser, you can setup user accounts, Apache, DNS, file sharing and much more.

This works out to be a perfect tool for aspiring sysadmins because it does a nice job of cloaking a lot of the nitty gritty complexity and detail that can be overwhelming for admins who are new to the concepts and tooling that Webmin covers.  By using Webmin, one can learn about a large number of interesting topics without having to worry about how to type in all of the commands or how to install and configure the tools that come bundled up in Webmin.  This allows users to really increase their productivity.  Couple the Webmin tool with a cookbook of nice concrete examples and you have a great recipe for learning how to use a powerful tool correctly.

Wrapping such a broad spectrum of topics and tools into a web based tool can be complicated.  But used as reference material, this book does a great job of making everything clear, with good explanations of how everything works together as well as pictorial examples that tie the written concepts to concrete, real world usage.  Now is also a good time to mention that this book follows a nice pattern of organizing topics.  It starts with the more basic administrative topics and principles, covering each one thoroughly with good descriptions and solid examples, then progresses nicely through the different topics and eventually covers some of the more obscure ones.

The Webmin Administrator’s Cookbook does a nice job of combining many complex system administration topics into a nice, easy to follow and read reference guide that can be utilized by all different levels of Linux and administrative experience.  If you use Webmin in any capacity at all, this book would be a great reference and guide to help you be more productive in your day to day with this tool.

You can find more information about the book here.  While you are at it, check out author Michal Karzynski’s blog for more interesting and useful tips: http://michal.karzynski.pl.

Leveraging Nagios Plugins with Chef and Sensu

Setting up Nagios plugins to run in a Sensu and Chef managed environment is straightforward.  For example, I have recently been interested in monitoring SSL certificate expiration dates, and it just so happens that the Nagios check_http plugin does exactly what I’m looking for.

The integration between Sensu and the Nagios plugins is very nice.  For convenience in our Sensu environment, we like to put the additional Nagios plugins on to all of the systems we monitor.  The footprint is negligible, and it gives us some nice flexibility in the services and checks we can monitor should a service we hadn’t anticipated get added to a server in the future.  For the small amount of effort it takes to get the checks onto the server and working, adding the Nagios plugins is totally worth it.

The first step is to add the Nagios plugins to your Chef recipe.  I am using a generic Chef recipe for my Sensu clients that takes care of some of the more tedious tasks, including downloading the appropriate scripts and checks for the clients to run as well as some other dependencies and items that Sensu likes to have.  Luckily there is a public Debian package available for installing the Nagios plugins so it is easy to add them.  Just add this snippet into your Chef recipe for Sensu clients:

apt_package "nagios-plugins" do 
 action :install 
end

After you run your next chef-client job you will have access to a variety of checks provided by the Nagios plugins package as illustrated below.

[Image: nagios checks]

There are a number of examples available but to run the check_http for cert expiration by hand you can run this command:

/usr/lib/nagios/plugins/check_http -H <sitename> -C 30,10

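Where <sitename> is the hostname of the website you would like to check.  The -C 30,10 option tells check_http to warn when the certificate has fewer than 30 days left and to go critical at fewer than 10.  Now that we are able to run this check manually, go ahead and roll that in to your Chef recipe for Sensu.  An example of this might look similar to the following: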

sensu_check "check_web" do 
  command "/usr/lib/nagios/plugins/check_http -H localhost -C 30,10" 
  handlers ["pagerduty", "slack"] 
  subscribers ["core"] 
  interval 60
  standalone true
  additional(:notification => "Certificate will expire soon", :occurrences => 5) 
end

You may not want to run this check on every host so it may be a good idea to run this check as a stand alone check.  It is simple enough to add this snippet in to any recipe and tack on the “standalone true” attribute to the sensu_check resource.  I have an example of what this standalone attribute looks like in the example above for reference.
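To make the two numbers passed to -C in the check above a little more concrete, here is a rough Python sketch of the threshold logic.  It is only an approximation of what check_http does with -C 30,10, not its actual implementation; the function name classify_expiry is my own, and the exit codes follow the usual Nagios convention of 0 for OK, 1 for warning and 2 for critical.

```python
import ssl
import time

OK, WARNING, CRITICAL = 0, 1, 2


def classify_expiry(not_after, now=None, warn_days=30, crit_days=10):
    """Map a certificate's notAfter date to a Nagios style status code.

    not_after is a string in the format returned by ssl getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    if now is None:
        now = time.time()
    # cert_time_to_seconds converts the notAfter string to seconds since the epoch.
    days_left = (ssl.cert_time_to_seconds(not_after) - now) / 86400
    if days_left <= crit_days:
        return CRITICAL  # also covers certificates that have already expired
    if days_left <= warn_days:
        return WARNING
    return OK
```

With the defaults, a certificate with 20 days left comes back as a warning and one with 5 days left as critical, which is the behaviour the handlers in the Sensu check would be reacting to.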

Adding in the Nagios plugins gives you a very nice set of additional tools for your monitoring arsenal for not much effort.  You never know when one of them might come in handy, and there are many other uses beyond certificate checks, so I suggest taking a look.
