Setting up a private git repo in Chef

It turns out that cloning and managing private git repo’s in Chef is not as easy as it looks.  That said, I have a method that works.  I have no idea if this is the preferred method or if there are any easier ways but this worked for me, so let me know if there is an easier way and I will be glad to update this post.

First, I’d like to give credit where it is due.  I used this post as a template as well as the SSH wrapper section in the deploy documentation on the Chef website.

The first issue is that when you connect to github via SSH it wants the Chef client to accept its public fingerprint.  By default, if you don’t modify anything SSH will just sit there waiting for the fingerprint to be accepted.  That is why the SSH Git wrapper is used, it tells SSH on the Chef client that we don’t care about the authentication to the github server, just accept the key.  Here’s what my ssh git wrapper looks like:

 #!/bin/bash 
 exec ssh -o "StrictHostKeyChecking=no" -i "/home/vagrant/.ssh/id_rsa" $1 $2

You just need to tell your Chef recipe to use this wrapper script:

# Set up github to use SSH authentication 
cookbook_file "/home/vagrant/.ssh/wrap-ssh4git.sh" do 
  source "wrap-ssh4git.sh" 
  owner "vagrant" 
  mode 00700 
end

The next problem is that when using key authentication, you must specify both a public and a private key.  This isn’t an issue if you are running the server and configs by hand because you can just generate a key on the fly and hand that to github to tell it who you are.  When you are spinning instances up and down you don’t have this luxury.

To get around this, we create a couple of templates in our cookbook to allow our Chef client to connect to github with an already established public and private key, the id_rsa and id_rsa.pub files that are shown.  Here’s what the configs look like in Chef:

# Public key 
template "/home/vagrant/.ssh/id_rsa.pub" do 
  source "id_rsa.pub" 
  owner "vagrant" 
  mode 0600 
end 
 
# Private key 
template "/home/vagrant/.ssh/id_rsa" do 
  source "id_rsa" 
  owner "vagrant" 
  mode 0600 
end

After that is taken care of, the only other minor caveat is that if you are cloning a huge repo then it might timeout unless you override the default timeout value, which is set to 600 seconds (10 mins).  I had some trouble finding this information on the docs but thanks to Seth Vargo I was able to find what I was looking for. This is easy enough to accomplish, just use the following snippet to override the default value

timeout 9999

That should be it.  There are probably other, easier ways to accomplish this and so I definitely think the adage “there’s more than one way to skin a cat” applies here.  If you happen to know another way I’d love to hear it.

Read More

Protip January: Get your external IP from the command line

Ever need to grab your IP quick but don’t want to get out of the command line or stop whatever you’re working on?  Or how about if you have SSH’d into a number of different servers and you simply want to know where you are at currently?  This little trick enables you to quickly determine your public IP address without leaving the command line.

I’ll admit, I didn’t originally come up with this one, but liked it so much that I decided to write a quick post about it because I thought it was so nice and useful. There is a great website called commandlinefu.com where users can post all their slick one liners, which is where I found this one.  If you haven’t been there before I highly recommend it, there is some really good stuff over there.

This one is simple yet quite useful, which is what I’m all about.  The command uses curl, so if you don’t have that bad boy installed yet you’ll need to go get that quick (Debian based distros).

sudo aptitude install curl

Once that is installed simply run the following:

curl ifconfig.me

And bam!  Emeril style.  Let that go out and do its thing and you will quickly have your external IP address.  I like this method a lot more than having to jump out of the shell and open up a browser then going to a website to get this information.  It might not save that much time but to me just knowing how to do this is useful and knowledge is power.  Or something.

Read More

Document storage: Part 6

Document Storage Project

This is Part 6: Tying it all together.

All that’s left to do now is write a script that will:

  • Detect when a new file’s been uploaded.
  • Turn it into a searchable PDF with OCR.
  • Put the finished PDF in a suitable directory so we can easily browse for it later.

This is actually pretty easy. inotifywait(1) will tell us whenever a file’s been closed, we can use that as our trigger to OCR the document.

Our script is therefore in two parts:

Part 1: will watch the /home/incoming directory for any files that are closed.
Part 2: will be called by the script in part 1 every time a file is created.

Part 1

This script lives in /home/scripts and is called watch-dir.

#!/bin/bash
INCOMING="/home/incoming"
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

inotifywait -m --format '%:e %f' -e CLOSE_WRITE "${INCOMING}"  2>/dev/null | while read LINE
do
        FILE="${INCOMING}"/`echo ${LINE} | cut -d" " -f2-`
        "${DIR}"/process-image "${FILE}" &
done

Part 2

This script lives in /home/scripts and is called process-image.

#!/bin/bash

# Dead easy - at least in theory!
# Take a single argument - filename of the file to process. 
# Do all the necessary processing to make it a 
# searchable PDF.

OUTFILE="`basename "${1}"`"
TEMPFILE="`mktemp`"

if [ -s "${1}" ]
then
	# We use the first part of the filename as a classification.
	CLASSIFICATION=`echo ${OUTFILE} | cut -f1 -d"-"`
	OUTDIR="/home/http/documents/${CLASSIFICATION}/`date +%Y`/`date +%Y-%m`/`date +%Y-%m-%d`"

	if [ ! -d "${OUTDIR}" ]
	then
		mkdir -p "${OUTDIR}" || exit 1
	fi

	# We have to move our file to a temporary location right away because 
	# otherwise pdfsandwich uses the file's own location for 
	# temporary storage. Well and good - but the file's location is 
	# subject to an inotify that will call this script!

	mv "${1}" "${TEMPFILE}" || exit 1

	# Have we a colour or a mono image? Probably quicker to find out 
	# and process accordingly rather than treat everything as RGB.
	# We assume the first page is representative of everything
        COLOURDEPTH=`convert "${TEMPFILE}[0]" -verbose -identify /dev/null 2>/dev/null | grep "Depth:" | awk -F'[/-]' '{print $2}'`
	if [ "${COLOURDEPTH}" -gt 1 ]
	then
		SANDWICHOPTS="-rgb"
	fi
	pdfsandwich ${SANDWICHOPTS} -o "${OUTDIR}/${OUTFILE}" "${TEMPFILE}" > /dev/null 2>&1
	rm "${TEMPFILE}"
fi

There’s just one thing missing: pdfsandwich. This is actually something I found elsewhere on the web. It hasn’t made it into any of the major distro repositories as far as I can tell, but it’s easy enough to compile and install yourself. Find it here.

Run /home/scripts/watch-dir every time we boot – the easiest way to do this is to include a line in /etc/rc.local that calls it:

/home/scripts/watch-dir &

Get it started now (unless you were planning on rebooting):

nohup /home/scripts/watch-dir &

Now you should be able to scan in documents, they’ll be automatically OCR’d and made available on the internal website you set up in part 3.

Further enhancements are left to the reader; suggestions include:

  • Automatically notifying sphider-plus to reindex when a document is added. (You’ll need a newer version of sphider-plus to do this. Unfortunately there is a cost associated with this, but it’s pretty cheap. Get it from here).
  • There is a bug in pdfsandwich (actually, I think the bug is probably in tesseract or hocr2pdf, both of which are called by pdfsandwich): under certain circumstances which I haven’t been able to nail down, sometimes you’ll find that in the finished PDF one page of a multi-page document will only show the OCR’d layer, not the original document. Track down this bug, fix it and notify the maintainer of the appropriate package so that the upstream package can also be fixed.
  • This isn’t terribly good for bulk scanning – if you want to scan in 50 one-page documents, you have to scan them individually otherwise they’ll be treated as a single 50 page document. Edit the script so we can somehow communicate with it that certain documents should be split into their constituent pages and store the resulting PDFs in this way.
  • Like all OCR-based solutions, this won’t give you a perfect representation of the source text in the finished PDF. But I’m quite sure the accuracy can be improved, very likely without having to make significant changes to how this operates. Carry out some experiments to figure out optimum settings for accuracy and edit the scripts accordingly.

Read More

Introduction to Weechat

Continuing our little mini series I wanted to introduce a great alternative to Irssi, called Weechat.  In this post we will go over how to get it set up and configured as closely to the way we had our workflow set up in Irssi.  This way you can evaluate both of these IRC clients for yourself and make the determination of which will suit you best for your needs.  If you haven’t taken a look at the previous posts they sort of build off of each other but if you want to compare Weechat and Irssi just take a look at Introduction to Irssi.

I’m not going to lie, the more I use Weechat the more it grows on me.  It is clean, easy to use and has some great functionality built in to it that isn’t offered out of the box in Irssi.  Small things for the most part, such as a native nicklist, colored nicks and some slick formatting to name a few.  It definitely looks and feels nice right away, without the need for customization.  One convincing argument some have mentioned for switching over to Weechat is that nearly everything is scriptable in a variety of different languages as well as very customizable.  Another nicety is the very slick script manager that you gives you the ability to install and manage different scripts and plugins without any hassle, which I will discuss later in this post.

Another cool thing I learned is that the writer/creator of the program hangs out in the #weechat IRC channel on freenode pretty much all the time and will answer questions for you!  How cool is that?  I had some issues getting the buffers.pl script set up the way I wanted and (@FlashCode) was there to immediately guide me in the right direction.

Bitlbee in Weechat

Just a quick word here.  I noticed that Bitlbee behaves slightly differently in Weechat so if you are used to Irssi then you should read about how to get things working.  Nothing too major here, just a few peculiarities that I thought were important to include.  I have listed below some of the most common things to go over in Bitlbee paired with Weechat.

Connect to bitlbee.

/connect localhost

Automatically connect to Bitlbee when Weechat starts.

/server add &bitlbee localhost -autoconnect

Add your gtalk account.

account add jabber [email protected]
acc 0 set password <password>
acc 0 on

Change nick to more readable form (restart for this to take effect).

acc 0 set nick_source full_name

Connect to gtalk on start (should work but still need to fix this).

/set irc.server.bitlbee.command "/msg &bitlbee identify <password>"

I couldn’t get oauth working, I will come back and update this later.

Getting used to Weechat

Now that we have all of that fun stuff out of the way we are finally ready for the meat of this post.  Let’s go ahead and install and fire up Weechat.

sudo aptitude install weechat
weechat-curses

Here is how to set some your defaults in Weechat, most of these are pretty intuitive.

/server add freenode irc.freenode.net
/set aspell.check.enabled on
/set irc.server.freenode.nicks "username, username_"
/set irc.server.freenode.username "username"
/set irc.server.freenode.realname "first and last"
/set irc.server.freenode.autoconnect on

Identify nick on server after connecting.

/set irc.server.freenode.command "/msg nickserv identify <password>"

Autojoin favorite channels after connecting to your IRC server.

/set irc.server.freenode.autojoin "##/r/sysadmin,#channel2"

Connect to your newly created IRC alias.

/connect freenode

As you can see, out of the box Weechat offers some very nice features.  The only thing that I found annoying/frustrating in Weechat was that there was no way to manage your different windows.  Remember, in Irssi this was done with the addition of the adv_windwolist.pl script.  In Weechat things are a tad different but there is a script to manage these windows (they are referred to as buffers in Weechat).

Installing the Weeget script manager (This has been deprecated)

Instead of using the “weeget.py” method use “script”.  It is baked in now so installing plugins is super easy now.

For a list of all available plugins visit the Weechat script page.

The first step is to get a nice little script that will allow us to get this script and others.  It is called weeget and should be installed first.  I HIGHLY suggest looking at getting this up and working, it will save you pain and misery down the road, trust me.

cd ~/.weechat/python
wget http://www.weechat.org/files/scripts/weeget.py

We need to restart Weechat for this script get picked up and to take effect.

Install buffers.pl

Once that is done, we will install our new window manager script with the following command inside Weechat.

/script install buffers

This will give you a nice pretty list of all your open windows and conversations on the left side of Weechat.

That’s okay but we can make this better!  Here is how to stick the the bar to either the top or the bottom of you console and to fill with columns.

/set weechat.bar.buffers.position top (or bottom)
/set weechat.bar.buffers.filling_top_bottom columns_vertical

Install iset

Another slick script to install is the iset.pl script.  This makes changing settings much easier as it adds a lightweight interface and description to all the different options and settings.  Installation is easy once we have weeget installed.

/script install iset

There is one last script that users may like, called buffer_autoclose.py.  This will close any inactive sessions that happen to remain open that don’t need to be and will begin working automatically once it is installed.  This will clean up the buffer list and just helps to improve the look and feel.  Super easy to install with weeget.

Install buffer_autoclose

/script install buffer_autoclose.py

And here is what our final product looks like.  Slightly different than irssi and I haven’t really gotten into custom themes or any of that jazz but it has really begun to shape up.

If you have any tips or tricks on how to improve this environment let me know.  I am new to Weechat and am still discovering all of its nuances and will be coming back to this post periodically to update things that I have found to be worthy of adding.

Resources:
http://leigh.cudd.li/article/Bitlbee_and_Weechat_Mini-Tutorial
http://kmacphail.blogspot.com/2011/09/using-weechat-with-freenode-basics.html
http://www.weechat.org/files/doc/devel/weechat_user.en.html#screen_layout

Read More

Introduction to Irssi

If you have been a follower this blog, I wrote a post awhile back that described my preferred settings in tmux and just recently wrote a post about getting set up with bitlbee.  Today we will be adding on another piece to what I will be calling my ultimate command line theme by introducing another useful command line tool for communications called Irssi.  Here are the back posts to these if you missed them.

Now that we have that out of the way let’s talk about Irssi.  Irssi is a console based IRC client that has been around for quite a while now.  There is somewhat of a debate holy war as to which console based IRC client is the best.  There are a number of hardcore Irssi users around that tout it is as the best, with the likewise being true for Weechat fans.  Before going any further I will say that there is definitely a certain amount of leg work to get Irssi up and running with the full set of customizations and features. That said, I believe the extra work is worth every minute of time and effort if you are looking for a fully featured, rich IRC experience.

I want to present both of these clients (Irssi and Weechat) to readers and let each person decide for themselves which is the best, because saying one is better than the other wouldn’t be a fair comparison, and is really like comparing apples and orages.  With that said, in a future post I will be going over the basics of using Weechat, the other touted console based IRC client.

Bitlbee in Irssi

Add an alias for Bitlbee.

/network add bitlbee
/server add -auto -network bitlbee localhost
/connect bitlbee

Register server account to tie to Irssi.

register
/oper
<desired password>

Automatically join and identify when Irssi starts.

/channel add -auto -botcmd '/say identify\; /oper' &bitlbee bitlbee

Add in your Gtalk account.

account add jabber [email protected]
/oper
<gmail password>

Set up correct port and ssl stuff for gtalk.

account jabber server talk.google.com:5223:ssl

Getting used to Irssi

Here we will assume that you have created and set up a user with irc.freenode.net.  Once that step has been completed you should be able to follow these instructions without any issues.

/SERVER ADD -auto -network freenode irc.freenode.net

You may have to shutdown and restart Irssi at this point for it to recognize the network name “freenode” in the next step.

/CHANNEL ADD -auto ##/r/sysadmin freenode

Adding advanced_windowlist.pl

First we need to download the script and put it into the appropriate place.  If you haven’t created your Irssi scripts directory and your autorun directory go ahead and make them quickly.

mkdir ~/.irssi/scripts
mkdir ~/.irssi/scripts/autorun

Change directories to your scripts directory and download the script.

cd ~/.irssi/scripts
wget http://anti.teamidiot.de/static/nei/*/Code/Irssi/adv_windowlist.pl

Let’s quickly set it to be executable.

sudo chmod +x adv_windowlist.pl

Now we need to symlink this script and then run it in Irssi. To symlink it run the following,

cd ~/.irssi/scripts/autorun
ln -s ../adv_windowlist.pl

And finally to load it into Irrsi.

/run adv_windowlist.pl

That should be it. This can come in handy when you have any more than a handful of windows and can’t keep your conversations straight. If we take a look at our Irssi session we can see that there is a name associated with each window number now.

As you can see there is now a name associated with each of the windows that we have open.  This looks pretty good but there are some cool features in this script that we are going to leverage to make it look even better.  In your Irssi session run the following commands to customize your display even further,

/statusbar window remove act
/set awl_display_key $Q%K|$N%n $H$C$S
/set awl_display_key_active $Q%K|$N%n $H%U$C%n$S
/set awl_display_nokey [$N]$H$C$S
/set awl_block 1

OK, this is looking better. We now have our current conversation underlined, our windows named and numbered with decent formatting and have set windows with activity to update and change colors. There are more options if you look at the script itself but this is a pretty good start.

Setting up hilight.pl

This script will add in the ability to check messages that contain your nick. This is a good way to easily check messages while you were away or didn’t get a chance to respond to.  First we need to make sure that the new window will split correctly.

SET autostick_split_windows ON
/hilight <nickname>

Now we add in and configure our new notification window.

/window new split
/window name hilight
/window size 4

Set up nm.pl

The description from its creator is “right aligned nicks depending on longest nick”.  This script will help with the readability and organization of your different chats.  I’m not sure if it requires nickcolor.pl but I have it in my scripts folder and symlinked to my autorun folder just in case.  Just load nm.pl in like you do for all other scripts and it will start doing its things.

/run nm.pl

Setting up themes

Definitely not a necessity but can help to make things cleaner and easier to read.  So far I have played around with the xchat and fear2 themes but will come back and update this post if I happen to find a better theme.  The good thing is that thems are really easy to set up and use.  So to load a specific theme just copy it into your /.irssi directory and turn it on in Irssi.

wget http://irssi.org/themefiles/xchat.theme
/set theme xchat

That’s all I have on Irssi for now.  If there is one complaint that I have about Irssi it is that the nicklist.pl script doesn’t play nicely in tmux (however it should be fine in screen).  It is a manual process and is a pain in the ass to get set up so I have chosen not to cover it in this post.  It is possible I know, but for me, it just wasn’t worth the trouble.  If you know of an easy way to get this working inside of tmux let me know.

Resources:
http://irssi.org/beginner/
http://quadpoint.org/articles/irssi/#channel_statusbar_using_advanced_windowlist
http://www.antonfagerberg.com/archive/my-perfect-irssi-setup
http://quadpoint.org/articles/irssisplit/

Read More