Setting up Git in PowerShell

It seems like everybody is using git these days.  And for most, not everybody is stuck using Windows in their day to day workflow.  Unfortunately, I am.  So that means it is much more painful to get up and running with a lot of the coolest and best open source projects that are offered by members of github and other online code repositories being shared via git.  However, there is hope and it is possible for Windows users to join the git party.  So in this post, I would like to describe just how to do that.  And it should only take a few minutes if done correctly.  I will mention beforehand that there are a few steps that need to be completed in order for this technique to work successfully that typically are taken care of in a Linux or OSX environment.

The goal of this post is to work through these steps as best I can to get users up and running as quickly as possible and as easily as possible, reducing the amount of confusion and fumbling around with settings.  This post is designed for beginners that are just getting their feet wet with git but hopefully others can use it as a resource if they are coming from a different environment and are confused by the Windows way of doing things.

First step – Download and install the git port for Windows.

This is pretty straight forward.  Download and run the executable to install git for Windows.  If you just want to get up and running or are lazy, you can leave all of the defaults when you run through the installation wizard.

Second step – Add the git binaries to your system path variable.

This is the most important step, because out of the box git won’t work in your ordinary PowerShell command prompt, it needs to be opened separately.  So to fix this and add all the necessary binaries open up your environmental variables (in Windows 8).

Computer -> Properties -> Advanced -> Environmental Variables

environmental variables

and add the following value to the PATH variable.

C:\Program Files\Git\bin

Here is what this should look like in Windows.

path variable

Third step (optional) – Download and install posh-git for better PowerShell and git integration.

I have highlighted part of this process before in an older post but will go through the steps again because it is pretty straight forward.  To be able to get posh-git you need to have a sort of PowerShell package management tool called PsGet (instructions here).  To get this tool run the following command from your PowerShell command prompt.

(new-object Net.WebClient).DownloadString("http://psget.net/GetPsGet.ps1") | iex

Once the command has completed you should be able to simply run this install command and be finished.

install-module posh-git

That should be it.  With these simple steps you should be able to utilize git from the command line like you are accustomed to on other operating systems.  As I said, there is a tad more leg work but you can really utilize the flexibility of PowerShell to get things working.  I hope it helps, and as always let me know if you have any tips or questions.

Read More

Quickly Find Exchange Database Usage

Here is a Powershell script you can use to quickly determine the total amount of space taken up by all of your Exchange database files (edb files) on an Exchange server.  I’d like to note that this may not necessarily be a 100% accurate representation but is a great way to get a ballpark number without having to add the numbers up yourself, manually.

$dbs = Get-MailboxDatabase -Status

foreach($db in $dbs) {

$edbsize = $db.DatabaseSize.Tobytes()
$totalsize += $edbsize

}

Write-Host $totalsize

I noticed that I had no way to calculate the total amount of space being used by my Exchange databases the other day.  And even after scouring through teh Googles I was unable to find what I was looking for quickly so I wrote this script up quick to fix that problem.  Just copy the previous bit of code into a ps1 file with notepad and execute the script from your EMS.  It is a super simple way to iterate through all the databases, save their sizes to a variable and then spit that variable out when it is complete.

Read More

Getting Python Fabric setup in Windows

This has really turned into a wild goose chase.  Initially my goal when I set out on this project was simply to get Fabric up and running so I could test out some different features on some network gear.  It seems like the Python integration in Windows is very different than it is in the Linux world where everything is all bundled up nice and neatly.  There are several separate, seemingly unrelated pieces that all need to fit together to get Python and Fabric working correctly in a Windows environment, which can be very perplexing at first, hence my need to write a post so I don’t have to remember all this complexity for next time.  I thought I might as well show people how I got this to work instead of picking and choosing different bits of information from the internet.

The following is a list of links that I have found to be helpful in getting everything up and going, flip back to here for the different resources and components:

There’s a few steps for getting up and running.  For basic Python functionality it should be enough to download and install Python via the basic installer in your Windows environment.  Accepting the defaults should be enough.  Also, I recommend going with Python 2.7, rather than 3.3 because it has much better backwards compatibility.  You will also want to double check to make sure you download the correct version for you OS as well, either 32-bit or 64-bit.

Once you have your Python install up and going you will want to get pip installed. You will use this tool to get Python modules because it aids tremendously with downloading, managing and installing useful Python code.

So to get up and running with pip, first make sure that you have the correctly matched version of Python and the pip installed for your environment.  For example the 2.7 pip installer will not work with a 3.3 Python installation.  Second, you will need to make sure you have the Distribute package installed in your Python environment as well.  This is the tool that will allow pip to work.  Once you have these modules installed you will need to switch to the directory where pip is installed (or add it to your ENV path variable).  For me it was located in the following location:

C:\Python27\Scripts\pip.exe

So the command to install Fabric would be as follows:

pip.exe install fabric

You would think that’s all you need to get fabric working right?  Well it turns out that using this method we do not have the correct version of Pycrypto installed.

pycrypto error

So using the link posted above go ahead and get the correct version of Pycrypto downloaded and installed (version 2.1.0).  That still doesn’t fix it though!  It just gets us to a different error.  I used this post and this post as a guide for getting the correct version of Pycrypto installed on the Windows machine.

Okay, so now we should have a fully functioning Python environment with Fabric installed.  The only main issue that remains at this point (to my knowledge at least) is that pip still doesn’t work quite right when attempting to install various Python packages.  To get that part working you will need MinGW32 installed (reference above for links).  But that is basically out of the scope of this post, I will write another post about it if there is any interest or you can ask me if you have issues as always.

The only other piece left then is to get Fabric up and going with our Cisco gear.  Take a look at the docs for basic usage on getting acquainted with Fabric, it is fairly straight forward for the most part.

One thing I was not aware of was the way Cisco CLI and devices would behave when using Fabric to control them remotely.  I was having issues with Fabric flaking out whenever I went into config mode on a Cisco switch.  It turns out that when you enter into config mode you are essentially dropped into a new shell and Fabric doesn’t have a nice way to deal with that.  So something like this will bomb out,

def test():
	run("conf t", shell=False)
	run("int 1/0/1", shell=False)
	run("no shut", shell=False)
	run("exit", shell=False)

The “conf t” command opens your new shell and the Cisco gear freaks out because it doesn’t know what to do with the next command.  I should also mention the shell=False is somewhat unrelated to this issue but it gets around Fabric trying to use bash as its default shell.  The workaround?  Use the open_shell command in Fabric and escape each command by using \n to escape to a new line.  So a sample command using this format would be something like the following,

def test():
	open_shell("conf t \n"
		   "ip name-server 1.1.1.1 \n"
		   "exit \n"
		   "exit \n"
		   )

Yeah this is sort of hacky, and I’m not sure if it will be able to do everything I am looking for but hey at least it kind of works.  I am currently looking for a more robust and easier way around this limitation so if you have any suggestions let me know.

Credit goes to markmm on reddit for letting me know about this workaround as well as the people who hang out on the #fabric irc channel on freenode.

Read More

Document storage: Part 6

Document Storage Project

This is Part 6: Tying it all together.

All that’s left to do now is write a script that will:

  • Detect when a new file’s been uploaded.
  • Turn it into a searchable PDF with OCR.
  • Put the finished PDF in a suitable directory so we can easily browse for it later.

This is actually pretty easy. inotifywait(1) will tell us whenever a file’s been closed, we can use that as our trigger to OCR the document.

Our script is therefore in two parts:

Part 1: will watch the /home/incoming directory for any files that are closed.
Part 2: will be called by the script in part 1 every time a file is created.

Part 1

This script lives in /home/scripts and is called watch-dir.

#!/bin/bash
INCOMING="/home/incoming"
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

inotifywait -m --format '%:e %f' -e CLOSE_WRITE "${INCOMING}"  2>/dev/null | while read LINE
do
        FILE="${INCOMING}"/`echo ${LINE} | cut -d" " -f2-`
        "${DIR}"/process-image "${FILE}" &
done

Part 2

This script lives in /home/scripts and is called process-image.

#!/bin/bash

# Dead easy - at least in theory!
# Take a single argument - filename of the file to process. 
# Do all the necessary processing to make it a 
# searchable PDF.

OUTFILE="`basename "${1}"`"
TEMPFILE="`mktemp`"

if [ -s "${1}" ]
then
	# We use the first part of the filename as a classification.
	CLASSIFICATION=`echo ${OUTFILE} | cut -f1 -d"-"`
	OUTDIR="/home/http/documents/${CLASSIFICATION}/`date +%Y`/`date +%Y-%m`/`date +%Y-%m-%d`"

	if [ ! -d "${OUTDIR}" ]
	then
		mkdir -p "${OUTDIR}" || exit 1
	fi

	# We have to move our file to a temporary location right away because 
	# otherwise pdfsandwich uses the file's own location for 
	# temporary storage. Well and good - but the file's location is 
	# subject to an inotify that will call this script!

	mv "${1}" "${TEMPFILE}" || exit 1

	# Have we a colour or a mono image? Probably quicker to find out 
	# and process accordingly rather than treat everything as RGB.
	# We assume the first page is representative of everything
        COLOURDEPTH=`convert "${TEMPFILE}[0]" -verbose -identify /dev/null 2>/dev/null | grep "Depth:" | awk -F'[/-]' '{print $2}'`
	if [ "${COLOURDEPTH}" -gt 1 ]
	then
		SANDWICHOPTS="-rgb"
	fi
	pdfsandwich ${SANDWICHOPTS} -o "${OUTDIR}/${OUTFILE}" "${TEMPFILE}" > /dev/null 2>&1
	rm "${TEMPFILE}"
fi

There’s just one thing missing: pdfsandwich. This is actually something I found elsewhere on the web. It hasn’t made it into any of the major distro repositories as far as I can tell, but it’s easy enough to compile and install yourself. Find it here.

Run /home/scripts/watch-dir every time we boot – the easiest way to do this is to include a line in /etc/rc.local that calls it:

/home/scripts/watch-dir &

Get it started now (unless you were planning on rebooting):

nohup /home/scripts/watch-dir &

Now you should be able to scan in documents, they’ll be automatically OCR’d and made available on the internal website you set up in part 3.

Further enhancements are left to the reader; suggestions include:

  • Automatically notifying sphider-plus to reindex when a document is added. (You’ll need a newer version of sphider-plus to do this. Unfortunately there is a cost associated with this, but it’s pretty cheap. Get it from here).
  • There is a bug in pdfsandwich (actually, I think the bug is probably in tesseract or hocr2pdf, both of which are called by pdfsandwich): under certain circumstances which I haven’t been able to nail down, sometimes you’ll find that in the finished PDF one page of a multi-page document will only show the OCR’d layer, not the original document. Track down this bug, fix it and notify the maintainer of the appropriate package so that the upstream package can also be fixed.
  • This isn’t terribly good for bulk scanning – if you want to scan in 50 one-page documents, you have to scan them individually otherwise they’ll be treated as a single 50 page document. Edit the script so we can somehow communicate with it that certain documents should be split into their constituent pages and store the resulting PDFs in this way.
  • Like all OCR-based solutions, this won’t give you a perfect representation of the source text in the finished PDF. But I’m quite sure the accuracy can be improved, very likely without having to make significant changes to how this operates. Carry out some experiments to figure out optimum settings for accuracy and edit the scripts accordingly.

Read More

Protip December: Customize your Powershell profile

Powershell is great but is a little boring if anything, out of the box, with its drab white foreground and all.  It isn’t exactly informative either, so I wanted to show everybody a quick trick to customize this look and feel to make things look a little bit cleaner.  Hopefully this introduction will demonstrate one of the many features that makes Powershell a great tool for Windows admins, which is its flexibility.

This customization file, called profile.ps1 can be located in one of two places.

  • The first location is the global location and would be useful when you want all users to have a customized Powershell profile.  This profile should be placed in C:\WINDOWS\system32\WindowsPowerShell\v1.0\Profile.ps1.
  • The second location is for the local profile and would be specific to each user account.  This file overrides the global configuration file and should be placed in C:\Username\My Documents\WindowsPowerShell\Profile.ps1.

By default these files don’t exist so you will have to navigate to the respective directory and create the initial, empty profile.ps1 file.

Once you have the file created just pop this chunk of code into your profile.ps1 file.

function prompt {
	$path = ""
	$pathbits = ([string]$pwd).split("\", [System.StringSplitOptions]::RemoveEmptyEntries)
	if($pathbits.length -eq 1) {
		$path = $pathbits[0] + "\"
	} else {
		$path = $pathbits[$pathbits.length - 1]
	}
	$userLocation = $env:username + '@' + [System.Environment]::MachineName + ' ' + $path
	$host.UI.RawUi.WindowTitle = $userLocation
    Write-Host($userLocation) -nonewline -foregroundcolor Green 

	Write-Host('>') -nonewline -foregroundcolor Green    
	return " "
}

Once you have this bit added you will need to reload Powershell and voila,

Customized powershell look
Here’s our customized Powershell look.

Much better.  As you can see we now have some additional, out of the box information including:

  • Our current username – jreichardt
  • Our computer name – JOSH-TEST
  • Our current directory – separated by the | (pipe) symbol
  • As well as a nice green foreground font to help with the readability.

Pretty simple to get set up but definitely adds a lot to the default look and feel.  Let me know if you have any other cool Powershell customization’s or tricks that you think are worth sharing.

Read More