Selecting a Cloud Provider

Well the deed is done.  I finally have migrated the blog off my home server and onto a hosted provider.  The site is finally starting to get big enough that I felt like a migration would be a step in the right direction.  When I originally created this blog it was more of an experiment than anything else, but over the past 2 years I have seen the blog and my writing grow in directions I really didn’t anticipate when it was created, which is very exciting for me.

Due to this growth, one of my main concerns is stability.  With the stage that this project is in now I just want a place that has good bandwidth and is available 24×7 when I want to write something.  I don’t want to have to worry about the power going out at my home or any sort of service disruption to my internet to cause my site to be unreachable.  My traffic numbers are relatively low if compared to some other popular websites but they aren’t anything to sneeze at either now that the blog and my writings have become more established and so it is important for me to have the site available all the time to people.  If you’re interested in the migration comment or send me a note and I can do a quick write up or go into more depth, but I figured I’d spare the details because there are already a number of good guides out there on how to do it.  The only real issue was upgrading from Apache 2.2 -> 2.4.

I’d like to take some time and talk about moving to a cloud provider.  It may be an unfamiliar process to some so there might be a few good takeaways by covering the topic briefly.  Cloud hosting and cloud technologies are evolving to be much more than just a fad, and a lot of companies are trying to position themselves for the next generation of cloud computing moving forward.  Choosing a cloud provider offers a number of benefits including lower over head and maintenance costs with running servers and a data center.  It also alleviates infrastructure maintenance, stability issues and dealing with hardware failures and troubleshooting.

There are a very large number of cloud providers out there currently and the competition is fierce.  In fact the competition is so fierce right now that Google and Amazon have recently begun a price war.

  • Heroku
  • Amazon AWS
  • Joyent
  • Azure
  • Rackspace
  • Linode
  • DreamHost
  • Google Cloud Platform
  • Digital Ocean

In my current view they are all great in their own way.  By that I mean that each of these providers can provide value in their own way.  If you are a business looking to move to the cloud the AWS is the way to go.  The Amazon cloud has been tested and vetted by some of the largest cloud companies (Netflix, Pinterest, LinkedIn).  It has been around for a very long time in cloud years so Amazon has been able to work out most of the issues as well create a golden standard.  The trade off is that for somebody new to cloud computing the services and interface can be confusing.  There are a lot of bells and whistles, which many people do not need.

If you’re a Microsoft shop you might take a look at the Azure platform.  Azure does a really good job of integrating with other Microsoft products and services.  This would be a logical move for anybody that leverages Microsoft technologies.

Rackspace and Joyent both leverage OpenStack for their underlying architecture.  OpenStack is open source software so there are some really interesting things revolving around that platform and technology.

Ultimately I decided to go with Digital Ocean for this project for a couple of reasons.  First, the price was there.  My blog doesn’t require a lot of horsepower so I was able to spin up a 1GB, 1CPU, 30GB, 1TB bandwidth Ubuntu server on DO for $10/month.  The second reason that I like DO and which many other people are moving towards DO is that the setup and configuration process is stupidly simple and easy.  Create an account, setup a credit card and away you go, up into the clouds.  The process from start to having a new server up literally took me 10 minutes.

That is the beauty of DO’s approach to cloud.  Make things as simple as possible for people to get up and going.  Certainly there aren’t nearly as many features as many other platforms but for many scenarios people just want a server to play around with, and DO does a great job of making that possible.

The point I’m trying to get at here is that there are basically different tools for different jobs.  You need to evaluate what all is out there and how it will suit your needs.  If you aren’t a Microsoft shop then you might not need to use Azure.  The good news is that with all of this competition and rivalry prices are dropping and more options and niches are becoming available as products mature and as new providers enter the scene.  The cloud introduces some pretty neat features and technologies but ultimately you need to decide what you or your business is looking for before you make a decision.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Setting up a GitHub webhook in Jenkins

This post will detail the steps to have Jenkins automatically create a build if it detects changes to  a GitHub repository.  This can be a very useful improvement to your continuous integration setup with Jenkins because this method is only telling Jenkins to attempt a new build when a change is detected rather than polling on an interval, which can be a little bit inefficient.

There are a few steps necessary to get this process working correctly that I would like to highlight in case I have to do this again or if anybody else would like to set this up.  Most of the guides that I found were very out of date so their instructions were a little bit unclear and misleading.

The first step is to configure Jenkins to talk to GitHub.  You will need to download and install the GitHub plugin (I am using version 1.8 as of this writing).  Manage Jenkins -> Manage Plugins -> Available -> GitHub plugin

GitHub plugin

After this is installed you can either create a new build or configure an existing build job.  Since I already have one set up I will just modify it to use the GitHub hook.  There are a few things that need to be changed.

First, you will need to add your github repo:

source code management

Then you will then have to tick the box indicated below – “Build when a change is pushed to GitHub”

build when github changes

Also note that Jenkins should have an SSH key already associated with the desired GitHub project.

You’re pretty close to being done.  The final step is to head over to GitHub and adjust the settings for the project by creating a webhook for your Jenkins server.  Select the repo you’re interested in and click Settings.  If you aren’t an admin of the repo you will not be able to modify the settings, so talk to an owner to either finish this step for you or have them grant you admin to make the change yourself.

The GitHub steps are pretty straight forward.  Open the “Webhooks & Services” tab -> choose “Configure Services” -> find the Jenkins (GitHub plugin option) and fill it in with a similar URL to the following:

  • http://<Name of Jenkins server>:8080/github-webhook/

jenkins webhook

Make sure to tick the active box and ensure it works by running the “Test Hook”.  If it comes back with a payload deployed message you should be good to go.

UPDATE

I found an issue that was causing us issues.  There is a check box near the bottom of the authentication section labeled “Prevent Cross Site Request Forgery exploits” that needs to be unchecked in order for this particular method to work.

Disable forgery option

Let me know if you have any issues, I haven’t found a good way to debug or test outside of the message returned from the GitHub configuration page.  I did find another alternative method that may work but didn’t need to use it so I can update this if necessary.

If you want more details about web hooks you can check out these resources:

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

What is DevOps?

Since landing myself in a new and unexplored terrain as a freshly minted DevOps admin, I have been thinking a lot about what exactly DevOps is and how I will translate my skills moving into the position.  I am very excited to have the opportunity to work in such a new and powerful area of IT (and at such a sweet company!) but really think I need to lay out some of the groundwork behind what DevOps is, to help strengthen my own understanding and hopefully to help others grasp some of the concepts and ideas behind it.

I have been hearing more and more about DevOps philosophy and its growing influence and adoption in the world of IT, especially in fast paced, cloud and start up companies.  From what I have seen so far, I think I people really need to start looking at the impact that DevOps is making in the realm of system administration and how to set themselves up to succeed in this profession moving forward.

Here is the official DevOps description on Wikipedia:

DevOps is a software development method that stresses communication, collaboration and integration between software developers and information technology (IT) professionals.  DevOps is a response to the interdependence of software development and IT operations. It aims to help an organization rapidly produce software products and services.

While this is a solid description, there still seems to be a large amount of confusion about what exactly DevOps is so I’d like to address some of the key ideas and views that go along with its mentality and application to system administration.  To me, DevOps can be thought of as a combination of the best practices that a career in operations has to offer with many of the concepts and ideas that are used in the world of development.  Especially those derived from Agile and Scrum.

The great thing about DevOps is that since it is so new, there is really no universally accepted definition of what it is limited to.  This means that those who are currently involved in the DevOps development and adoption are essentially creating a new discipline, adding to it as they go.  A current DevOps admin can be described in simple terms as a systems admin that works closely with developers to decrease the gap between operations and development.  But that is not the main strength that DevOps offers and really just hits the tip of the ice burg for what DevOps actually is and means.

For one, DevOps offers a sort of cultural shift in the IT environment.  Traditionally in IT landscapes, there has been somewhat of a divide between operations and development.  You can think of this divide as a wall built between the dev and the ops teams either due to siloing of job skills and responsibilities or how the organization at broader perspective operates.  Because of this dissection of duties, there is typically little to no overlap between the tool sets or thought process between the dev or ops teams, which can cause serious headaches trying to get products out the door.

So how do you fix this?

In practical application, the principals of DevOps can put into practice using things like Continuous Integration tools , configuration management, logging and monitoring, creating a standardized test, dev and QA environments, etc.  The DevOps mindset and culture has many of its roots in environments of rapid growth and change.  An example of this philosophy put in to practice is at start up companies that rely on getting their product to market as quickly as smoothly as possible.  The good news is that larger enterprise IT environments are beginning to look at some of the benefits of this approach and starting to tear down the walls of the silos.

Some of the benefits of DevOps include:

  • Increased stability in your environment (embracing config management and version control)
  • Faster resolution of problems (decrease MTT)
  • Continuous software delivery (increasing release frequency brings ideas to market faster)
  • Much faster software development life cycles
  • Quicker interaction and feedback loops for key business stakeholders
  • Automate otherwise cumbersome and tedious tasks to free up time for devs and ops teams

These are some powerful concepts.  And the benefits here cannot be underestimated because at the end of the day the company you work for is in the business of making money.  And the faster they can make changes to become more marketable and competitive in the market the better.

One final topic I’d like to cover is programming.  If you are even remotely interested in DevOps you should learn to program, if you don’t know already.  This is the general direction of the discipline and if you don’t have a solid foundation to work from you will not be putting yourself into the best position to progress your career.  This doesn’t mean you have to be a developer, but IMO you have have to at least know and understand what the developers are talking about.  It is also very useful to know programming for all of the various scripting and automation tasks that are involved in DevOps.  Not only will you be able to debug issues with other software, scripts and programs but you will be a much more valuable asset to your team if you can be trusted to get things done and help get product shipped out the door.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Exchange Transport Service won’t start

Due to an outage this weekend, I’d like to take a minute to briefly describe the scenario that occurred and how it was resolved.  If you are having trouble starting your Exchange Transport Service then you may potentially be running into the same issue I was having during the outage.  Luckily there is an easy remedy for the service failing to start.  Basically what was happening was the Exchange message queue database was beginning to fail due to some sort of corruption, causing the Transport service to fail.  Because the Transport service wasn’t running, the Edge Sync process was failing, causing external mail delivery to fail.  Obviously a big issue, since you cannot receive any email from external domains if this is not working correctly.

To troubleshoot this, there are a few obvious signs that you should look at first.  The main thing you should check first is your disk sizes, I wrote about it in my previous post.  If your disks are full or are filling up then you are pretty much dead in the water and will need to fix your disk issue.  In my scenario the disk sizes were not an issue so the next tool I turned to were the logs.  I found a number of interesting entries in the Windows Application Event logs that gave me some clues.  I want to detail as many of these messages as I can so that people who are having similar issues know what to look for.

Transport error Transport error Transport error Transport error

There are a few possible resolutions to this problem.  Through some Google searches one solution I found is that you can attempt to repair the corruption in the queue databases by running the database through ESE util.  There is no guarantee this will work and it can potentially take a lot of time, depending on the size of your queue database. There is some good information here about the mail queue and how it works.

If you decide to repair the database, the mail queue file is located in the following location:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\data\Queue

Inside this directory is a file called tmp.edb.  This is the file that you will need to repair.

The other method is much simpler and was the solution I went with.  Instead of attempting to repair the database corruption, simply copy and rename the queue folder and restart the Transport service.  Doing this will force the Transport service to create a new, fresh copy of the database queue along with all of the accompanying config files and associated items that are required to get things up and running.  It is faster and simpler, IMO.  The only problem with this approach is that items that were stuck in the queue when the database corruption occurred will be lost.  For me, this was an acceptable loss.  If not, you will probably have to use the first method and attempt to repair the database or try to somehow work with a shadow copy or backup somehow to get unstuck.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.

Monitor your Exchange disk sizes

A word to the wise.  If you all of a sudden are unable to send and receive email messages in your Exchange environment, take a look and make sure the Exchange server disks aren’t being filled up.  Today I ran across an interesting (and by interesting I mean that this could have caused a serious outage) issue where Windows updates were very routinely being downloaded for our next patch management installation cycle but unknowingly were also causing our email services to stop functioning correctly.  I am thankful the scenario didn’t get ugly and luckily this event gives me the opportunity to talk about a few of things that I think might be useful for readers and other admins.

It turns out that this month’s wave of Windows updates caused the disks on our Hub Transport servers to quietly fill up during the day, unbeknownst to any of the admins.  In normal circumstances this process is by design and almost never becomes an issue, however in this case there was not enough disk available for Exchange to work correctly.  This could have been disastrous had we not known that the disk was starting to fill up.  We could have been chasing our tails for a much longer period of time and the situation could have escalated to a more stressful situation.  For some reason, the company likes to be able to send and receive emails.  Thank god for monitoring that works.

There are a couple things that need to be investigated at this point.  First, had we not known that the Windows updates were what were causing the disk to fill up, a logical place to start looking for clues would be to examine the log files on the suspect servers.  I would like to take a little bit of time and quickly go over some steps for looking at logs in an Exchange environment, when thinking about potential disk space issues a few things come to mind.  Are log files growing rapidly?  Did somebody turn on verbose logging and accidentally forget to turn it off?  To verify the logs aren’t the issue there are a few places that are good to look.  If you are familiar with or have ever used message tracking in Exchange you know how powerful it can be.  Sometimes that can also potentially be an issue with your disk filling up.  Here is the location that these message tracking logs are stored:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\MessageTracking

Another location that gets used when you turn on verbose logging for troubleshooting send or receive connectors are the smtpsend and smtpreceive directories.  These can fill up quite quickly if you forget to turn off verbose logging on a send or receive connector when are you done troubleshooting.  This location is here:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog

Finally, there is a location for logging protocol settings on the hub transport.  These logs can be found here:

C:\Program Files\Microsoft\Exchange Server\V14\TransportRoles\Logs\ProtocolLog

I would like to point out quickly that any and all of the behaviors of these logging methods can be modified using the Exchange Management Shell, and sometimes for more detailed settings can only be modified by the EMS.

If these quick spot checks don’t uncover any immediate problems another good technique to help gain some insight into where your disk space issues are is to use a tool that enumerates file locations and file sizes.  There are a few tools available, one of them I like to use is Space Sniffer.  It is fast, easy to use and gives a good visual representation of directory sizes and file sizes.  The tool can do much more but in this case we are just interested in finding the disk issue quickly.  We were able to quickly find that the size and contents of the %windir%\softwaredistribution\download folder were growing rather quickly.  I just happen to know that this is the temporary location that Windows uses to store Windows update files before they are installed.

There are a few things that can be done here.  You can either clear the temporary Windows updates files, delete other unnecessary files or you can grow your disks.  We were lucky because our Hub Transport servers are VM’s and increasing the disk size of these servers is simple.  That seems like the best option if it is a possibility, just in case something like this happens again we will have the additional space so the Exchange servers won’t bog down.

Ultimately we prevented the disaster from occurring but the incident is a great illustration of the lesson I’d like to share.  Make sure you have a good monitoring and alerting solution in place.  Otherwise you may not have any clue where to start looking.  If we did not have a reliable monitoring tool in place it would have been much more difficult to track this problem down in the first place because our Exchange environment is large and complex.  Because we have good monitoring tools we were able to quickly identify the problem and resolve it before anything bad happened.  On a side note, I am still thinking about how we can take this monitoring and alerting one step further in the future to become proactive instead of reactive but for now the monitoring tools are doing their job and because of this we avoided a potential disaster.  If you have any thoughts on proactive monitoring and alerting relating to these types of disk issues let me know, I’d love to hear how you handle it.

About the Author: Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.