Category Archives: Bash

Patching CVE-2014-6271 (Shellshock) with Chef

If you haven’t heard the news yet, a vulnerability was recently disclosed that lets an attacker execute arbitrary commands through specially crafted environment variables in bash.  This has far-reaching implications because bash is so widespread and runs on many different types of devices: network gear, routers, switches, firewalls, etc.  If that doesn’t scare you then you probably don’t need to finish reading this article.  For more information you can check out this article that helped to break the story.

I have been seeing a lot of “OMG the world is on fire, patch patch patch!” posts and sentiment surrounding this recently disclosed vulnerability, but have seen almost nobody taking the time to explain how to patch and fix the issue.  It is not a difficult fix, but it might not be obvious to the more casual user or to those who do not have a sysadmin or security background.

Debian/Ubuntu:

Use the following commands to search through your installed packages for the correct package release.  You can check the Ubuntu USN for versions.

dpkg -l | grep '^ii'
or
dpkg-query --show bash

If you are on Ubuntu 12.04 you will need to update to the following version:

bash    4.2-2ubuntu2.3

If you are on Ubuntu 13.10, and have this package (or below), you are vulnerable.  Update to 14.04!

 bash 4.2-5ubuntu3

If you are on Ubuntu 14.04, be sure to update to the most recent patched version.

bash 4.3-7ubuntu1.3
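
If you would rather check the version in a script than by eye, something like the following should work.  This is only a minimal sketch: bash_is_patched is a made-up helper name, and it assumes dpkg is available (it is on Debian and Ubuntu).

```shell
#!/bin/bash
# Succeeds when the installed package version ($1) is at least the
# patched version ($2), using dpkg's own version comparison rules.
# bash_is_patched is a hypothetical helper, not part of any package.
bash_is_patched() {
    local installed="$1" patched="$2"
    dpkg --compare-versions "$installed" ge "$patched"
}

# Example usage on a 14.04 box:
#   installed="$(dpkg-query --show --showformat='${Version}' bash)"
#   bash_is_patched "$installed" "4.3-7ubuntu1.3" || echo "bash needs updating"
```

Swap in the patched version for your own release from the list above.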

Luckily, the update process is pretty straightforward.

apt-get update
apt-get --only-upgrade install bash

If you have the luxury of managing your environment with some sort of automation or configuration management tool (get this in place if you don’t have it already!) then this process can be managed quite efficiently.  For example, in a Chef infrastructure you can blast out the update with the following command:

knife ssh 'platform_family:debian' 'sudo apt-get update; sudo apt-get install -y bash'; knife ssh 'platform_family:redhat' 'sudo yum -y install bash'

This will iterate over every server in your Chef server environment that is in the Debian family (including Ubuntu) or RHEL family (including CentOS) and refresh the package lists so that the latest patched bash version gets pulled down and installed.

You may need to tweak the syntax a little: -x overrides the SSH user and -i supplies an identity file.  This is so much faster than manually installing the update on all your servers or even fiddling around with a tool like Fabric, which is still better than nothing.

One caveat to note:  If you are not on an LTS version of Ubuntu, you will need to upgrade your server(s) first to an LTS, either 12.04 or 14.04, to qualify for this patch.  Ubuntu 13.10 went out of support in July 2014, a couple of months before this writing, so you will want to get your OS up to date.

One more thing:  The early patches to address this vulnerability did not entirely fix the issue, so make sure that you have the correct patch installed.  If you patched right away there is a good chance you may still be vulnerable, so simply rerun your knife ssh command to reapply the newest patch, now that the dust is beginning to settle.
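
To see whether a given box is still exposed to the original bug, you can use the widely circulated one-line probe.  Here it is wrapped in a small function as a sketch.  The name check_shellshock is my own, and this only covers CVE-2014-6271, not the follow-up issues fixed by the later patches, so treat a clean result as necessary rather than sufficient.

```shell
#!/bin/bash
# Probe a bash binary for the original Shellshock bug (CVE-2014-6271).
# A vulnerable bash runs the command smuggled in after the function
# definition stored in the environment variable x.
# check_shellshock is a hypothetical helper name.
check_shellshock() {
    local shell_bin="${1:-bash}"
    local out
    out="$(env x='() { :;}; echo VULNERABLE' "$shell_bin" -c 'echo probe' 2>/dev/null)"
    case "$out" in
        *VULNERABLE*) echo "vulnerable" ;;
        *)            echo "not vulnerable" ;;
    esac
}
```

Run it as check_shellshock bash, or pass a path to test a specific binary.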

Outside of this vulnerability, it is a good idea to get your OS on an Ubuntu LTS version anyway to continue receiving critical updates for software as well as security patches for a longer duration than the normal, 6 month release cycle of the server distribution.

7 useful but hard to remember Linux commands

I have found myself using these commands over and over so I decided I’d take the time to go ahead and document them for future me as well as readers because I find these commands pretty useful.  I just always manage to forget them, hence the title of the post.  The smart thing to do would be to create aliases for these commands but I have just been too lazy and some of them are run across different servers so it isn’t always a convenient option.

Anyway, let’s go ahead and run through the commands before I forget…

1) du -ah / | sort -rh | head -n 50

This one is really handy for debugging space issues.  It will list the 50 largest files and directories, with the largest at the top of the list.  Note that sort needs -h rather than -n here so that the human-readable sizes printed by du -h (4.0K, 12M, 1.5G) are ordered correctly.  The “/” specifies the location to search, so you can easily modify this one to search different locations, like “/var/log” for example if you are having trouble with growing log files.
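
Since I keep changing the path and the count, here is a rough sketch of the same pipeline as a function.  The name biggest and its defaults are my own invention, and it assumes a sort that understands -h (GNU coreutils does).

```shell
#!/bin/bash
# List the N largest files and directories under a path.
# Usage: biggest [dir] [count]   (hypothetical helper; defaults: / and 50)
biggest() {
    local dir="${1:-/}" count="${2:-50}"
    # -a: include files, -h: human-readable sizes; sort -rh orders those
    # sizes largest first; errors for unreadable paths are discarded.
    du -ah "$dir" 2>/dev/null | sort -rh | head -n "$count"
}
```

For example, biggest /var/log 20 shows the 20 largest entries under /var/log.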

2) git checkout -- .

I don’t use this one very often, which is probably why I manage to forget it so easily.  But I really like it.  Sometimes I will be working on a git repo across different machines at the same time and will run in to conflicts committing to the repo, or more likely I committed changes on one machine and just need to pull down the newest changes but can’t since I have made local modifications.  For those scenarios you can run the above command to quickly discard the uncommitted changes in your working tree.  Be careful: the discarded changes cannot be recovered.

3) tmux kill-window -t 3

I use tmux for my terminal and window manager on all my workstations and love it.  If you haven’t heard of it, take a look here.  Sometimes a window can get stuck, so it becomes necessary to close that window without destroying the whole tmux session.  Again, this doesn’t happen very often so it is sometimes hard for me to remember the exact syntax, but this one is a handy little trick for managing tmux windows and sessions.

4) grep -r "text" .

I know, I should really have this one memorized by now.  I am trying to remember but I don’t find myself using this one all that often even though it is really powerful and useful.  This will search recursively through every file under the current directory and print each line that matches the text pattern you feed to it.
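
A small sketch of how I would wrap this one up, adding -n so each match comes with its line number.  The function name search_tree is made up for illustration.

```shell
#!/bin/bash
# Recursively search a directory for a pattern, printing file:line:match.
# Usage: search_tree pattern [dir]   (hypothetical helper; dir defaults to .)
search_tree() {
    local pattern="$1" dir="${2:-.}"
    # "--" stops option parsing so patterns starting with "-" still work.
    grep -rn -- "$pattern" "$dir"
}
```

For example, search_tree "Out of memory" /var/log.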

5) kill $(pgrep process)

This one is handy when there are a large number of stuck processes and you need to blow them all out with one command.  For example, if the Chrome browser ever gets stuck with a million tabs open, there are likely a large number of processes all with the same (or similar) names.  If you pass all or part of the process name in to this command, pgrep will find the matching process IDs and kill will terminate them.
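
One rough edge is that kill complains when pgrep finds nothing.  Here is a hedged sketch that handles that case; kill_by_name is a name I made up.  Note that pkill, which ships alongside pgrep, does the find-and-kill in a single step.

```shell
#!/bin/bash
# Kill every process whose name matches a pattern, and say so when
# nothing matches instead of letting kill print a usage error.
# kill_by_name is a hypothetical helper name.
kill_by_name() {
    local pids
    pids="$(pgrep -- "$1")" || {
        echo "no process matches '$1'" >&2
        return 1
    }
    # Deliberately unquoted: one PID per word.
    kill $pids
}
```

It is worth running pgrep -l with the same pattern first to preview what would be killed.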

6) docker rm $(docker ps -a -q)

I have been using Docker more and more recently and every once in a while I find myself with a large number of dead Docker containers that need to be cleaned up.  This command will remove all of these stopped containers at once (running containers are left alone, because docker rm refuses to remove them without -f).  This is nice because stopped containers take up disk space and can often fill up your drives without you being aware.  I have been able to reclaim large amounts of disk space with this command.

7) watch -n 10 df -ah

This is another good one for checking disk space issues.  It will update you every ten seconds with the disk utilization of the system.  Pretty straightforward, but a great tool to help troubleshoot space issues.
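
If you only care about filesystems that are nearly full, the df output can be filtered.  This is a sketch: mounts_over is a made-up name, and it reads df -P style output on stdin and assumes mount points without spaces.

```shell
#!/bin/bash
# Print "mountpoint usage" for filesystems above a usage threshold.
# Reads `df -P` output on stdin; mounts_over is a hypothetical helper.
mounts_over() {
    local limit="$1"
    # Skip the header row, strip the % sign from column 5 and compare.
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 > limit + 0) print $6, $5
    }'
}
```

Typical use would be df -P | mounts_over 90, which also works under watch if you wrap it in bash -c.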

That’s all I have for now.  There are lots more, but these are the most useful ones that I find myself forgetting most often, so hopefully this post will serve as a nice reminder.  If you have any cool or useful commands that you would like to share, feel free to comment and I will update the post to include them.

Setting up a private git repo in Chef

As a Chef newbie I really might have bitten off more than I could chew when I first looked at how to get private github repos working, but I am glad I pushed through and got a solution working.  I have no idea if this is the preferred method or if there are any easier ways, but this worked for me, so I am hoping that if any other Chef newbies stumble across this problem they can use this post as a guide or reference.

First, I’d like to give credit where it is due.  I used this post as a template as well as the SSH wrapper section in the deploy documentation on the Chef website.

The first issue I had problems with is that when you connect to github via SSH it wants the Chef client to accept github’s host key fingerprint.  By default, if you don’t modify anything, SSH will just sit there waiting for the fingerprint to be accepted.  That is why the SSH Git wrapper is used: it tells SSH on the Chef client to skip host key checking and just accept the key, at the cost of not verifying the identity of the github server.  Here’s what my ssh git wrapper looks like:

#!/usr/bin/env bash
# Git runs this wrapper in place of ssh; forward all of its arguments.
/usr/bin/env ssh -o "StrictHostKeyChecking=no" -i "/home/vagrant/.ssh/id_rsa" "$@"

Your Chef recipe needs to drop this wrapper script onto the node (the git or deploy resource can then point at it through its ssh_wrapper attribute):

# Set up github to use SSH authentication 
cookbook_file "/home/vagrant/.ssh/wrap-ssh4git.sh" do 
  source "wrap-ssh4git.sh" 
  owner "vagrant" 
  mode 00700 
end
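
For comparison, this is roughly what the same setup looks like by hand, outside of Chef.  Treat it as a sketch: install_git_ssh_wrapper is a name I made up, and the repo URL in the usage note is a placeholder.

```shell
#!/bin/bash
# Write the SSH wrapper to disk and make it executable for its owner,
# which is what the cookbook_file resource does on the node.
# install_git_ssh_wrapper is a hypothetical helper name.
install_git_ssh_wrapper() {
    local dest="$1" key="$2"
    # $key is expanded now; \$@ is written literally into the wrapper.
    cat > "$dest" <<EOF
#!/usr/bin/env bash
exec /usr/bin/env ssh -o "StrictHostKeyChecking=no" -i "$key" "\$@"
EOF
    chmod 700 "$dest"
}
```

Git picks the wrapper up from the GIT_SSH environment variable, so after installing it you could run something like GIT_SSH=/home/vagrant/.ssh/wrap-ssh4git.sh git clone git@github.com:example-org/private-repo.git (the repo name here is hypothetical).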

The next problem is that when using key authentication, you must specify both a public and a private key.  This isn’t an issue if you are running the server and configs by hand because you can just generate a key on the fly and hand that to github to tell it who you are.  When you are spinning instances up and down you don’t have this luxury (actually you might but it seemed like a pain in the ass).

To get around this, we create a couple of templates in our cookbook to allow our Chef client to connect to github with an already established public and private key, the id_rsa and id_rsa.pub files that are shown.  Here’s what the configs look like in Chef:

# Public key 
template "/home/vagrant/.ssh/id_rsa.pub" do 
  source "id_rsa.pub" 
  owner "vagrant" 
  mode 0600 
end 
 
# Private key 
template "/home/vagrant/.ssh/id_rsa" do 
  source "id_rsa" 
  owner "vagrant" 
  mode 0600 
end

After that is taken care of, the only other minor caveat is that if you are cloning a huge repo then it might time out unless you override the default timeout value, which is set to 600 seconds (10 mins).  I had some trouble finding this information in the docs, but thanks to Seth Vargo I was able to find what I was looking for.  This is easy enough to accomplish: just add the following line to the resource that clones the repo to override the default value:

timeout 9999

That should be it.  There are probably other, easier ways to accomplish this and so I definitely think the adage “there’s more than one way to skin a cat” applies here.  If you happen to know another way I’d love to hear it.

Protip January: Get your external IP from the command line

Ever need to grab your IP quickly but don’t want to get out of the command line or stop whatever you’re working on?  Or how about if you have SSH’d into a number of different servers and you simply want to know where you currently are?  This little trick enables you to quickly determine your public IP address without leaving the command line.

I’ll admit, I didn’t originally come up with this one, but liked it so much that I decided to write a quick post about it because I thought it was so nice and useful. There is a great website called commandlinefu.com where users can post all their slick one liners, which is where I found this one.  If you haven’t been there before I highly recommend it, there is some really good stuff over there.

This one is simple yet quite useful, which is what I’m all about.  The command uses curl, so if you don’t have that bad boy installed yet you’ll need to go get that quick (Debian based distros).

sudo aptitude install curl

Once that is installed simply run the following:

curl ifconfig.me

And bam!  Emeril style.  Let that go out and do its thing and you will quickly have your external IP address.  I like this method a lot more than having to jump out of the shell and open up a browser then going to a website to get this information.  It might not save that much time but to me just knowing how to do this is useful and knowledge is power.  Or something.
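
If you want to use the result in a script, it is worth adding a timeout and a sanity check on what comes back.  This is a minimal sketch under a few assumptions: the function names are mine, and -4 forces IPv4 so the simple validator applies.

```shell
#!/bin/bash
# Succeeds if the argument looks like a dotted-quad IPv4 address.
# is_ipv4 and external_ip are hypothetical helper names.
is_ipv4() {
    local ip="$1" octet
    [[ "$ip" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
    local IFS=.
    for octet in $ip; do
        # 10# forces base 10 so leading zeros are not read as octal.
        (( 10#$octet <= 255 )) || return 1
    done
}

# Fetch the external address, giving up after five seconds.
external_ip() {
    local ip
    ip="$(curl -4 -fsS --max-time 5 ifconfig.me)" || return 1
    is_ipv4 "$ip" && echo "$ip"
}
```

Calling external_ip prints the address, or returns non-zero if the lookup fails or the reply is not an IPv4 address.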

Document storage: Part 6

Document Storage Project

This is Part 6: Tying it all together.

All that’s left to do now is write a script that will:

  • Detect when a new file’s been uploaded.
  • Turn it into a searchable PDF with OCR.
  • Put the finished PDF in a suitable directory so we can easily browse for it later.

This is actually pretty easy. inotifywait(1) will tell us whenever a file’s been closed, and we can use that as our trigger to OCR the document.

Our script is therefore in two parts:

Part 1: will watch the /home/incoming directory for any files that are closed after being written.
Part 2: will be called by the script in part 1 every time such a file is closed.

Part 1

This script lives in /home/scripts and is called watch-dir.

#!/bin/bash
INCOMING="/home/incoming"
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

inotifywait -m --format '%:e %f' -e CLOSE_WRITE "${INCOMING}"  2>/dev/null | while IFS= read -r LINE
do
        # Each line is "EVENTS filename"; strip everything up to the first space.
        FILE="${INCOMING}/${LINE#* }"
        "${DIR}"/process-image "${FILE}" &
done
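
The filename has to be recovered from each line that inotifywait prints.  With the '%:e %f' format a line typically looks like "CLOSE_WRITE:CLOSE scan.pdf".  The sketch below shows one way to split it that keeps spaces in filenames intact; filename_from_event is a made-up name, and the sample line in the usage note is an assumption about the output rather than a captured log.

```shell
#!/bin/bash
# Extract the filename from one line of inotifywait '%:e %f' output.
# The event list contains no spaces, so everything after the first
# space is the filename. filename_from_event is a hypothetical helper.
filename_from_event() {
    local line="$1"
    printf '%s\n' "${line#* }"
}
```

For example, filename_from_event 'CLOSE_WRITE:CLOSE bank-statement march.pdf' prints the filename with its space preserved, whereas piping an unquoted echo through cut would collapse runs of spaces.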

Part 2

This script lives in /home/scripts and is called process-image.

#!/bin/bash

# Dead easy - at least in theory!
# Take a single argument - filename of the file to process. 
# Do all the necessary processing to make it a 
# searchable PDF.

OUTFILE="`basename "${1}"`"
TEMPFILE="`mktemp`"

if [ -s "${1}" ]
then
	# We use the first part of the filename as a classification.
	CLASSIFICATION=`echo ${OUTFILE} | cut -f1 -d"-"`
	OUTDIR="/home/http/documents/${CLASSIFICATION}/`date +%Y`/`date +%Y-%m`/`date +%Y-%m-%d`"

	if [ ! -d "${OUTDIR}" ]
	then
		mkdir -p "${OUTDIR}" || exit 1
	fi

	# We have to move our file to a temporary location right away because 
	# otherwise pdfsandwich uses the file's own location for 
	# temporary storage. Well and good - but the file's location is 
	# subject to an inotify that will call this script!

	mv "${1}" "${TEMPFILE}" || exit 1

	# Have we a colour or a mono image? Probably quicker to find out 
	# and process accordingly rather than treat everything as RGB.
	# We assume the first page is representative of everything
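	SANDWICHOPTS=""
	# identify may print "Depth: 8-bit" or "Depth: 8/1-bit"; keep the last number.
	COLOURDEPTH=$(convert "${TEMPFILE}[0]" -verbose -identify /dev/null 2>/dev/null | awk '/Depth:/ { d = $2; sub(/-bit.*/, "", d); sub(/.*\//, "", d); print d; exit }')
	if [ "${COLOURDEPTH:-0}" -gt 1 ]
	then
		SANDWICHOPTS="-rgb"
	fi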
	pdfsandwich ${SANDWICHOPTS} -o "${OUTDIR}/${OUTFILE}" "${TEMPFILE}" > /dev/null 2>&1
	rm "${TEMPFILE}"
fi
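
The output directory is derived from the part of the filename before the first hyphen plus today's date.  Here is that logic on its own as a sketch, so it is easy to try out.  outdir_for is a made-up name, and the optional second argument (a date in YYYY-MM-DD form) exists only to make the function testable.

```shell
#!/bin/bash
# Work out where a finished PDF should be filed.
# Usage: outdir_for filename [YYYY-MM-DD]   (hypothetical helper)
outdir_for() {
    local name classification today
    name="$(basename "$1")"
    # Everything before the first "-" is the classification.
    classification="${name%%-*}"
    today="${2:-$(date +%Y-%m-%d)}"
    # ${today%%-*} is the year, ${today%-*} is year-month.
    printf '/home/http/documents/%s/%s/%s/%s\n' \
        "$classification" "${today%%-*}" "${today%-*}" "$today"
}
```

A file with no hyphen in its name is filed under its whole name, which is also what cut does in the script above.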

There’s just one thing missing: pdfsandwich. This is actually something I found elsewhere on the web. It hasn’t made it into any of the major distro repositories as far as I can tell, but it’s easy enough to compile and install yourself. Find it here.

Run /home/scripts/watch-dir every time we boot – the easiest way to do this is to include a line in /etc/rc.local that calls it:

/home/scripts/watch-dir &

Get it started now (unless you were planning on rebooting):

nohup /home/scripts/watch-dir &

Now you should be able to scan in documents; they’ll be automatically OCR’d and made available on the internal website you set up in part 3.

Further enhancements are left to the reader; suggestions include:

  • Automatically notifying sphider-plus to reindex when a document is added. (You’ll need a newer version of sphider-plus to do this. Unfortunately there is a cost associated with this, but it’s pretty cheap. Get it from here).
  • There is a bug in pdfsandwich (actually, I think the bug is probably in tesseract or hocr2pdf, both of which are called by pdfsandwich): under certain circumstances which I haven’t been able to nail down, sometimes you’ll find that in the finished PDF one page of a multi-page document will only show the OCR’d layer, not the original document. Track down this bug, fix it and notify the maintainer of the appropriate package so that the upstream package can also be fixed.
  • This isn’t terribly good for bulk scanning – if you want to scan in 50 one-page documents, you have to scan them individually otherwise they’ll be treated as a single 50 page document. Edit the script so we can somehow communicate with it that certain documents should be split into their constituent pages and store the resulting PDFs in this way.
  • Like all OCR-based solutions, this won’t give you a perfect representation of the source text in the finished PDF. But I’m quite sure the accuracy can be improved, very likely without having to make significant changes to how this operates. Carry out some experiments to figure out optimum settings for accuracy and edit the scripts accordingly.