Setting up a private git repo in Chef

It turns out that cloning and managing private git repos in Chef is not as easy as it looks.  That said, I have a method that works.  I have no idea if it is the preferred approach, but it worked for me, so let me know if there is an easier way and I will be glad to update this post.

First, I’d like to give credit where it is due.  I used this post as a template as well as the SSH wrapper section in the deploy documentation on the Chef website.

The first issue is that when you connect to GitHub via SSH, it wants the Chef client to accept its host key fingerprint.  By default, if you don’t modify anything, SSH will just sit there waiting for the fingerprint to be accepted.  That is why the SSH git wrapper is used: it tells SSH on the Chef client to skip the host key check and accept GitHub’s key automatically.  Here’s what my SSH git wrapper looks like:

#!/bin/bash
# Skip the host key prompt, force our deploy key, and pass all
# arguments straight through to ssh
exec ssh -o "StrictHostKeyChecking=no" -i "/home/vagrant/.ssh/id_rsa" "$@"

You then need your Chef recipe to place this wrapper script on the node (we will point the git resource at it below):

# Set up github to use SSH authentication 
cookbook_file "/home/vagrant/.ssh/wrap-ssh4git.sh" do 
  source "wrap-ssh4git.sh" 
  owner "vagrant" 
  mode 00700 
end

The next problem is that key authentication requires an established key pair: the public key registered with GitHub, and the matching private key on the client.  This isn’t an issue if you are building the server and configs by hand, because you can just generate a key on the fly and hand the public half to GitHub to tell it who you are.  When you are spinning instances up and down automatically, you don’t have this luxury.

To get around this, we create a couple of templates in our cookbook that let our Chef client connect to GitHub with an already established key pair, the id_rsa and id_rsa.pub files shown below.  Here’s what the configs look like in Chef:

# Public key 
template "/home/vagrant/.ssh/id_rsa.pub" do 
  source "id_rsa.pub" 
  owner "vagrant" 
  mode 0600 
end 
 
# Private key 
template "/home/vagrant/.ssh/id_rsa" do 
  source "id_rsa" 
  owner "vagrant" 
  mode 0600 
end
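
With the wrapper and the keys in place, you can sanity check the authentication by hand before converging; GitHub should greet you by the account name associated with the key.  This is just a manual spot check, not part of the recipe:

ssh -o "StrictHostKeyChecking=no" -i /home/vagrant/.ssh/id_rsa -T git@github.com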

After that is taken care of, the only other minor caveat is that if you are cloning a huge repo, it might time out unless you override the default timeout value, which is set to 600 seconds (10 minutes).  I had some trouble finding this information in the docs, but thanks to Seth Vargo I was able to find what I was looking for.  This is easy enough to accomplish; just use the following snippet inside your git resource to override the default value:

timeout 9999
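
To show where that snippet lives, here is a minimal sketch of the git resource that ties the wrapper, the keys, and the timeout together.  The repository URL and destination path are hypothetical placeholders; ssh_wrapper and timeout are the properties doing the work:

git "/home/vagrant/myapp" do
  repository "git@github.com:myorg/myapp.git"      # hypothetical private repo
  revision "master"
  user "vagrant"
  ssh_wrapper "/home/vagrant/.ssh/wrap-ssh4git.sh" # the wrapper from above
  timeout 9999                                     # override the 600 second default
  action :sync
end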

That should be it.  There are probably other, easier ways to accomplish this, so I definitely think the adage “there’s more than one way to skin a cat” applies here.  If you happen to know another way, I’d love to hear it.

Podcasts for DevOps admins

Getting up to speed in a fast moving environment forces you to think about things in a different way, which for me was (and is) an interesting sort of paradigm shift.  Moving from the enterprise to a startup, I have found things to be much different, so embracing the DevOps philosophy and culture has been a main focus of mine through this transition, in a good way of course.  Today I’d like to share some resources that I have found immensely helpful in my journey thus far into the land of DevOps.  Hopefully readers in the same position that I am in can use this information in their own DevOps journey.

In my experience, podcasts are one of the absolute best ways to consume information, whether on a morning commute or watching the show live; good podcasts are one of the best learning tools around.  So for today’s post, I have compiled a list of good DevOps-related shows that I hope others find useful.

If you’re interested, I wrote a post a while back focusing on some of my favorite podcasts relating to system administration.  You can find the list and the original Podcasts for System Administrators post here.

The Food Fight Show

From their website: “Food Fight is a bi-weekly podcast for the Chef community. We bring together the smartest people in the Chef community and the broader DevOps world to discuss the thorniest issues in system administration.”  The show offers great conversation on DevOps topics, a lot of really in-depth technical discussion from industry experts, and some great interviews with various contributors to the DevOps community.  This is currently my favorite DevOps podcast, and there is a large back catalog of episodes, so you can hand pick a few to try out if you are skeptical.

DevOps Cafe

This show takes a round table format similar to the style of The Food Fight Show.  Co-hosted by Damon Edwards and John Willis, it covers a lot of cool news and interesting topics on the bleeding edge of the DevOps world.  There is a nice variety of interesting guests as well as relevant topics of discussion.  I like this show because it does a great job of focusing on the more practical aspects of DevOps, rather than the abstract concepts and ideas behind it.  To me, it is more practice than theory.  That might be a horrible description, so you’ll just have to go check out the podcast and find out for yourself.

Arrested DevOps

This podcast is in much the same vein as The Food Fight Show: DevOps pros sit down and discuss issues related to what is going on in the DevOps world.  I just started listening to it, as it is one of the newer additions to my DevOps podcast rotation.  The show definitely has a lot of potential; the hosts are knowledgeable, the guests are smart and the topics of conversation are interesting.  Trevor and Matt do a good job of mixing technical discussion with some of the broader DevOps topics and ideas, so I would definitely recommend taking a look at this podcast.

Ops All The Things

Another new kid on the block, this show is hosted by Chris Webber and Steven Murawski of Stack Exchange fame.  The show focuses on system administration, operations and DevOps.  I like this podcast because it does a good job of blending DevOps with system administration, which is the track that I followed into the world of DevOps.  Much of the show is geared towards nuts and bolts administration, which is nice.  Topics are often in depth and technical, with discussions revolving around things like configuration management, monitoring, revision control, etc.  One other nice feature is that it covers some administration topics related to Windows, which I think gives listeners a good perspective.

The Ship Show

I haven’t had a chance to dive too deep into this one yet, but judging from the few episodes I’ve been able to listen to, this show definitely captures a lot of relevant and interesting issues in the community.  Once I get more episodes under my belt and a better feel for the show, I will update this post.  But just to give readers an idea, this is from their bio:  “The Ship Show is a twice-monthly podcast, featuring discussion on everything from build engineering to DevOps to release management, plus interviews, new tools and techniques, and reviews.”

If I missed any or if you’re interested in starting a DevOps oriented podcast let me know and I will be sure to add you to the list and help spread the word.  I think it’s important for people in the community to help out and share their knowledge.

Introduction to Logstash+ElasticSearch+Kibana

There are a few problems with the current state of logging.  The first is that there is no real unified or agreed-upon standard for how to do logging across software platforms, so it is typically left up to the software designer to choose how to design and output logs.  Because of this non-standardized approach, logs come in many different formats, which is obviously an issue if you are attempting to gather useful and meaningful data from a variety of sources.  Because of this large number of log types and formats, numerous logging tools have been created, each trying to solve a certain type of logging problem, and selecting one tool that offers everything can quickly become a chore.  The other big problem is that logs can produce an overwhelming amount of information.  Many of the traditional tools do nothing to correlate and represent the data that they collect, so narrowing down specific issues can become very difficult.

Logstash solves both of these problems in its own way.  First, it does a great job of abstracting away a lot of the difficulty of log collection and management.  Say, for example, you need to collect MySQL logs, Apache logs, and syslogs on a system.  Logstash doesn’t discriminate: you just tell Logstash what to expect and it will process those logs for you.  With ElasticSearch and Kibana, you can then quickly gather useful information by searching through logs and identifying patterns and anomalies in your data.

The goal of this post is to take readers through the process of getting up and running, starting from scratch all the way to a working example.  Feel free to skip through any of the various sections if you are looking for something specific.  I’d like to mention quickly that this post covers the steps to configure Logstash 1.4.2 on an Ubuntu 13.10 system, with a log forwarding client on anything you’d like.  You *may* run into issues if you are trying these steps on different versions or Linux distributions.

Installing the pieces:

We will start by installing all of the various pieces that work together to create our basic centralized logging server.  The architecture can be a little confusing at first; here is a diagram from the Logstash docs to help.

Logstash architecture

Each of the following components performs a specific task:

  • Java – runtime environment that Logstash needs to run.
  • Logstash – collects and processes the logs coming into the system.
  • ElasticSearch – stores, indexes and allows for searching the logs.
  • Redis – used as a queue and broker to feed messages and logs to Logstash.
  • Kibana – web interface for searching and analyzing logs stored by ES.

Java

sudo apt-get install openjdk-7-jre

Logstash

cd ~
curl -O https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
tar zxvf logstash-1.4.2.tar.gz

ElasticSearch (Logstash is picky about which version gets installed)

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.deb
sudo dpkg -i elasticsearch-1.3.2.deb

Redis

sudo apt-get install redis-server

Kibana

cd ~
wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.0.tar.gz
tar xvfz kibana-3.0.0.tar.gz

Nginx

sudo apt-get install nginx

Configuring the pieces:

This is an important part of a successful setup because there are a lot of different moving parts here.  If something isn’t working correctly, you will want to double check that you have all of your configs set up correctly.

Redis

This step will bind redis to your public interface so that other servers can connect to it.  Find the following line in the /etc/redis/redis.conf file:

bind 127.0.0.1

and change it to the following:

bind 0.0.0.0

We need to restart redis for it to pick up our change:

sudo service redis-server restart
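
To confirm the new binding took effect, you can ping redis from another machine (this assumes the redis-cli client is installed wherever you run it); a healthy server answers PONG:

redis-cli -h 192.168.1.200 ping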

ElasticSearch

Find the lines in the /etc/elasticsearch/elasticsearch.yml file and change them to the following:

cluster.name: elasticsearch
node.name: "logstash test"

and restart elasticsearch:

sudo service elasticsearch restart

You can test your config and ElasticSearch by browsing to the name or IP of the host on port 9200, e.g. http://192.168.1.200:9200
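
The same check works from the command line; a running node responds with a small JSON document containing the node and cluster details:

curl http://192.168.1.200:9200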

Logstash server

We need to create a config here for the central Logstash server.  Let’s create a file called /etc/logstash/server.conf and add the following:

input {
  # Pull events off the redis list that the client agents push to
  redis {
    host => "192.168.1.200"
    type => "redis"
    data_type => "list"
    key => "logstash"
  }
}
output {
  # Echo events to the console (handy while testing with --verbose)
  stdout { }
  # Index events into the ElasticSearch cluster configured above
  elasticsearch {
    cluster => "elasticsearch"
  }
}

Replace host with the local IP address of your redis server; in this case it is on the same host as Logstash.

Finally, fire up the Logstash server with the following command:

logstash/bin/logstash --verbose -f /etc/logstash/server.conf
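
Once a client starts shipping events (configured below), Logstash creates a daily logstash-YYYY.MM.DD index in ElasticSearch; listing the indices is a quick way to confirm that the whole pipeline is writing data:

curl 'http://192.168.1.200:9200/_cat/indices?v'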

Kibana

You will need to navigate to your Kibana files.  From the installation steps above we chose ~/kibana-3.0.0.  To get everything working, we need to edit the config.js file in the Kibana directory so that it points at the correct host.  Change it from:

elasticsearch: "http://"+window.location.hostname+":9200"

to

elasticsearch: "http://192.168.1.200:9200"

Nginx

We are almost done.  We just have to configure Nginx to point at our Kibana site.  To do this we need to copy the Kibana directory to the default webserver root:

sudo mkdir /var/www
sudo cp -R ~/kibana-3.0.0 /var/www/kibana

Finally, we edit the /etc/nginx/sites-enabled/default file and find the following:

root /usr/share/nginx/www;

and change it to read as follows:

root /var/www/kibana;

Now restart Nginx:

sudo service nginx restart
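
A quick check that Nginx is now serving the Kibana files; a working setup returns an HTTP 200:

curl -I http://localhost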

Now you should be able to open up a browser and navigate to either http://localhost or to your IP address and get a nice web GUI for Kibana.

Logstash client

We’re almost finished.  We just need to configure the client to forward some logs over to our central Logstash server.  Follow the instructions for downloading Logstash as we did earlier on the centralized logging server.  Once you have the files ready to go, you need to create a config for the client.

Again we will create a config file.  This time it will be /etc/logstash/agent.conf, with the following configuration:

input {
  file {
    type => "apache-access"
    path => "/var/log/apache2/other_vhosts_access.log"
  }

  file {
    type => "apache-error"
    path => "/var/log/apache2/error.log"
  }
}

filter {
  # Parse each raw Apache line into structured fields
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Use the timestamp from the log line as the event timestamp
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  stdout { }
  redis {
    host => "192.168.1.200"
    data_type => "list"
    key => "logstash"
  }
}

Let’s fire up our client with the following command:

logstash/bin/logstash --verbose -f /etc/logstash/agent.conf
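
If nothing shows up, you can generate a test event yourself.  This assumes Apache is serving something locally on the client; depending on your vhost setup, the entry may land in access.log rather than other_vhosts_access.log, so adjust the paths in agent.conf accordingly:

curl -s http://localhost/ > /dev/null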

If you switch back to your browser and wait a few minutes, you should start to see logs being displayed.  If logs are coming in and appearing on the event chart, you have it working!

Kibana interface

Conclusion

As I have learned, there are always caveats.  For example, I was getting some strange errors on my client endpoint whenever I ran the Logstash agent to forward logs to the central Logstash server.  It turned out that Java didn’t have enough RAM and CPU assigned to it.  Be aware that you may run into seemingly random problems if you don’t allocate enough resources to the machine.

Another quick tip for when you are encountering issues or things just aren’t working correctly: turn on verbosity (which we have done in our example), which will give you some clues to help identify the more specific problems you are having.

Resources

http://www.slashroot.in/logstash-tutorial-linux-central-logging-server
http://logstash.net/docs/1.4.0/

Monitor your Exchange disk sizes

A word to the wise: if you are suddenly unable to send and receive email messages in your Exchange environment, take a look and make sure the Exchange server disks aren’t being filled up.  Today I ran across an interesting (and by interesting I mean that this could have caused a serious outage) issue where Windows updates were routinely being downloaded for our next patch management installation cycle, but were unknowingly also causing our email services to stop functioning correctly.  I am thankful the scenario didn’t get ugly, and luckily this event gives me the opportunity to talk about a few things that I think might be useful for readers and other admins.

It turns out that this month’s wave of Windows updates caused the disks on our Hub Transport servers to quietly fill up during the day, unbeknownst to any of the admins.  In normal circumstances this process is by design and almost never becomes an issue; in this case, however, there was not enough disk available for Exchange to work correctly.  This could have been disastrous had we not known that the disk was starting to fill up.  We could have been chasing our tails for a much longer period of time and the situation could have escalated into something far more stressful.  For some reason, the company likes to be able to send and receive emails.  Thank god for monitoring that works.

There are a couple of things to investigate at this point.  First, had we not known that the Windows updates were causing the disk to fill up, a logical place to start looking for clues would be the log files on the suspect servers.  So I would like to take a little bit of time to go over some steps for looking at logs in an Exchange environment.  When thinking about potential disk space issues, a few things come to mind: are log files growing rapidly?  Did somebody turn on verbose logging and accidentally forget to turn it off?  To verify the logs aren’t the issue, there are a few good places to look.  If you are familiar with or have ever used message tracking in Exchange, you know how powerful it can be; sometimes it can also be the reason your disk is filling up.  Here is the location where the message tracking logs are stored:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\MessageTracking

Another location that gets used when you turn on verbose logging for troubleshooting send or receive connectors holds the smtpsend and smtpreceive directories.  These can fill up quite quickly if you forget to turn off verbose logging on a send or receive connector when you are done troubleshooting.  This location is here:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog

I would like to point out quickly that the behavior of all of these logging methods can be modified using the Exchange Management Shell, and some of the more detailed settings can only be modified through the EMS.
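
For example, here is a hedged sketch of capping the message tracking logs from the EMS so they can’t quietly fill a disk; the server name and the limits are examples, not our production settings:

# Sketch only: server name and limits are examples, adjust for your environment
Set-TransportServer HUB01 -MessageTrackingLogMaxDirectorySize 1GB -MessageTrackingLogMaxAge 30.00:00:00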

If these quick spot checks don’t uncover any immediate problems, another good technique for gaining insight into where your disk space is going is to use a tool that enumerates file locations and file sizes.  There are a few tools available; one that I like is SpaceSniffer.  It is fast, easy to use and gives a good visual representation of directory and file sizes.  The tool can do much more, but in this case we are just interested in finding the disk issue quickly.  With it, we were able to quickly find that the size and contents of the %windir%\softwaredistribution\download folder were growing fast.  I happen to know that this is the temporary location that Windows uses to store Windows update files before they are installed.

There are a few things that can be done here.  You can clear the temporary Windows update files, delete other unnecessary files, or grow your disks.  We were lucky because our Hub Transport servers are VMs, and increasing the disk size of these servers is simple.  That seems like the best option if it is a possibility; if something like this happens again we will have the additional space, so the Exchange servers won’t bog down.

Ultimately we prevented the disaster from occurring, but the incident is a great illustration of the lesson I’d like to share: make sure you have a good monitoring and alerting solution in place, otherwise you may not have any clue where to start looking.  Our Exchange environment is large and complex, and without a reliable monitoring tool it would have been much more difficult to track this problem down in the first place.  On a side note, I am still thinking about how we can take this monitoring and alerting one step further to become proactive instead of reactive, but for now the tools are doing their job and because of them we avoided a potential disaster.  If you have any thoughts on proactive monitoring and alerting for these types of disk issues, let me know; I’d love to hear how you handle it.

Why Computer Science degrees translate to System Administration

I run across a lot of articles and posts that claim a degree in Computer Science is usually irrelevant to system administration and that you are just as well off with another degree, or no degree at all. I think that line of logic is very short sighted, and today I am going to (or at least attempt to) explain why. By no means am I criticizing these alternate approaches; in fact I believe there is more than one way to skin a cat, and I have met many highly successful admins who reached their positions by these alternate means. I just want to quickly clarify that I am not advising readers that the CS route is the only right way to become a sysadmin. I am simply relating my own experiences in system administration to my background in CS, and making the case that pursuing a degree in Computer Science, or any other engineering degree for that matter, isn’t going to hurt your chances of becoming a sysadmin.

When you think of Computer Science you think of programming, or maybe math; at least I do. Most CS programs these days have a heavy orientation towards programming and the scientific and mathematical applications of programming as it applies to the world around us. As an aside, I am beginning to see many more programs that are tailored to specific disciplines inside the realm of IT, which looks promising. This is a great hybrid approach in my opinion, because it gives students a chance to look at a few alternate options. Coding isn’t my passion, so an option to become a system administrator without the intense amount of coding in a CS program looks attractive.

It is true that many of the mundane daily tasks related to system administration don’t involve 8 hours a day of reading and writing code. Because of this, I think it is important to characterize a sysadmin as somebody who relies on software tools and programming to solve problems and technical challenges, but doesn’t necessarily devote all of their time and energy to living in and interacting with code. The relationship of the sysadmin to programming is more of an indirect one, though still a very important one.

The farther along I get in my journey as a sysadmin, the more I realize how much the CS background is helping me.  I have a solid foundation in many of the core concepts taught through the CS program, which in turn have indirectly improved my abilities as a system administrator.  The first and most valuable asset my CS background has given me is the ability to write and understand code, which is extremely useful in my daily slew of activities.  It allows me to approach problems with a programmatic methodology, automate redundant and repeatable tasks with scripts, build intuition into why databases or programs are slow, debug issues systematically, and so on.  Obviously these skills can be learned elsewhere, but having them rolled up into your education as part of the package deal is very convenient.  I would much rather have this set of skills and the ability to look at things from a different perspective than have to learn each of these techniques separately.  Somebody coming from a business or similar background is unlikely to know about things like big O notation or how different algorithms work at a fundamental level; it just isn’t part of their background, so they don’t spend time thinking about these things.

This parlays well into other areas, setting you up with a diversified and broad horizon of future employment prospects. For example, take a pure sysadmin who knows no programming or CS; at their core they know system administration. But what if they either get burnt out (which is common in this profession) or don’t keep their skills matched to their position? There is nowhere in the industry for these individuals to turn, unless they want to go into management. That is why I believe individuals who choose not to learn how to program, or to at least understand how system administration and programming relate to each other, are essentially crippling their future prospects. With a diverse background, the CS sysadmin could potentially move into a DevOps role, a pure programming and development role, or a management role. In today’s diverse IT ecosystem, programming and development skills are very much sought after, so the demand for these other types of positions and skill sets is high.

Another well known fact in the IT industry, which I don’t necessarily agree with but which nonetheless exists, is that just having a CS degree will open doors that may not otherwise be open without a degree. I personally believe that a degree shouldn’t dictate your position, but by having one you set yourself up for some unique opportunities and certainly are not hurting yourself. For example, all other things being equal, somebody scanning through resumes has to choose between an applicant with a degree in Computer Science and one with a degree in Philosophy. Which do you think will be picked? Like I said, I don’t think the hiring process is fair or even has much to do with skill, but a degree can be used as a way to get ahead of the competition, and can therefore be valuable by itself as a strategic component in the hiring process if nothing else.

Here’s what I am saying: you don’t have to have a degree in Computer Science to be a great system administrator. But the CS background definitely equips you with the tools to understand some of the more abstract technical concepts and ideas, and gives you a robust framework for working through and solving difficult, complex problems. Ultimately, being a good sysadmin (let alone anything else) comes down to a combination of many different things, including a willingness to learn and the amount of experience an individual possesses. There is no cookie cutter way to build the perfect sysadmin, and you will invariably find a very diverse group of people in this profession, but a head start with a CS degree is certainly one path that won’t hurt you.
