Docker for Mac file system performance summary

One of the more controversial topics right now in the Docker community is the issue surrounding file system performance in the Docker for Mac application.

For a very long time users have been forced to use workarounds to speed up performance when dealing with slow read and write times.  For example, this thread has been open on the Docker forums for over a year now, describing the problem and various workarounds users have found during that time.  There have been blog posts describing various optimizations, as well as scripts and tools to alleviate some of the frustration around slow file system performance on Docker for Mac.

There is a great explanation from the Docker team that lays out the details of the file system performance issues and what the crux of the problem is right now.

At the highest level, there are two dimensions to file system performance: throughput (read/write IO) and latency (roundtrip time). In a traditional file system on a modern SSD, applications can generally expect throughput of a few GB/s. With large sequential IO operations, osxfs can achieve throughput of around 250 MB/s which, while not native speed, will not be the bottleneck for most applications which perform acceptably on HDDs.

The article later goes on to highlight the plan to improve performance along with a number of specific items for accomplishing this.

Under development, we have:

  1. A Linux kernel patch to reduce data path latency by 2/7 copies and 2/5 context switches
  2. Increased OS X integration to reduce the latency between the hypervisor and the file system server
  3. A server-side directory read cache to speed up traversal of large directories
  4. User-facing file system tracing capabilities so that you can send us recordings of slow workloads for analysis
  5. A growing performance test suite of real world use cases (more on this below in What you can do)
  6. Experimental support for using Linux’s inode, writeback, and page caches
  7. End-user controls to configure the coherence of subsets of cross-OS bind mounts without exposing all of the underlying complexity

Additionally, with the latest release of the Docker for Mac 17.04-ce-mac7 (April 6 2017) client, a new :cached flag has been introduced for volume mounts to help with read times for lots of files.  There is also work going on to introduce another :delegated flag to help speed up write times.

Initial user testing of the :cached flag has been good, and shown up to a 4x improvement in some cases.  You can follow this issue on Github to get the most up to date information.  There is some really good detail and discussion going on over there (towards the bottom of the issue is where the new flags are discussed).

Overall I think Docker has done a great job of keeping users informed and updated on the various aspects of the problem and has been steadily making progress in addressing the situation.  The container ecosystem is still very young so there will be growing pains along the way and I think the way that Docker has been handling things has been more than reasonable as they have consistently been making progress on addressing the issue and have been transparent in recent months about what’s going on and how they’re working on the problem.

Read More

Defaulting Google search results to the past year

If you spend as much time looking around the internet for answers to obscure questions as I do, you quickly realize that often times, Google will happily present you with results that are many years old and out of date.  This is especially frustrating when you eventually find an answer after spending quite some time searching, only to realize the answer you’re looking for is 5 years old.  For more reading on efficient searching check out this post on the 10 tab rule, which has some useful ideas for better searching in general.

xkcd

There is a trick that will allow you to customize the Google default results.  The key to mapping the Google search terms is really just the URL.  Google uses search parameters for querying, so you can do some really cool things.  Obviously this is a powerful concept, so a lot of useful searches can be mapped.  This idea can be taken further to map keywords to other searches, like YouTube, Google maps, Stack Overflow, etc, or basically any site that provides a search interface.

I have only tested these key/search mappings for Google search results on Google Chrome, so if you use another browser there might be a similar trick, I just haven’t attempted it.  Open Chrome settings and navigate to the “Search” section.

This will pull up a dialog box with a list of default search engines.  Scroll to the bottom of the list and add the following values to the corresponding fields from the screenshot.

The Keyword is just a “>” symbol, and it can be anything really.  I chose the > because it is quick and easy to get to.  The rest is pretty self explanatory.  After the entry has been added, scroll through the search engines and find the new “Google recent” entry – there is a button that says “Make default” if you hover over the search engine entry.  Click that and then click done.

Now when you do a Google search from your search bar it will default to items from the past year.

Bonus

You can extend this trick further to map keys in your Google search bar to do other searches.  For example, you can map a key (or word) to search for image results.  In the below example, I am using “I” as the keyword.

After adding the above snippet into the search settings you can navigate to the search bar, type in I (followed by a tab) and the term to search for and it will automatically do an image search.  The key to making the mapping being a tab completion in the search bar is the q=%s part in the URL.

The last bonus search that has worked for me is the “feeling lucky” search.

That keyword (I used a tilde) can once again be anything, but preferably should be fast to get reach to make searching easier.

One final note

Sometimes you actually do want to search for results that are over a year old.  This is true of information that doesn’t really change often.  So if you are having trouble finding a website you think should be at the top of the search, make sure that the default search result is set to any time.  Ideally you would make another key mapping to handle this searching behavior.

Read More

awesome-rancher

Awesome lists are a newish phenomenon for organizing links and information that have been gaining popularity on Github.  You can read more about awesome lists here, but essentially they are curated lists of helpful links and other various resources that are put together by members of their communities.

There are lots of awesome lists these days on Github, and there are even awesome lists for non-tech categories, including recipes and video games. Oddly enough though through my own searching, I quickly discovered that there was no awesome list for Rancher.

So I decided to give a little bit back to the Rancher community in my own way.  If you aren’t aren’t familiar with Rancher, it is essentially a platform and orchestration engine for container based workloads.  Out of all of the container management and orchestration platforms that have been growing out of the container popularity explosion, using Rancher has definitely the most pleasant experience and easiest to use.  Also, if you are new to Rancher, please go check out the list on Github for more information, that is what it is there for.

Check out the awesome-rancher project here.

The goal for me and for the project is to make awesome-rancher the go to resource for finding useful information.  Therefore, I want awesome-rancher to be as easy to use and as complete as possible so new users have a good jumping off place when they decide to dive into Rancher.

If you are a Rancher community member or even just notice a key resource missing from the list, please either let me know with a comment here, on twitter/facebook, or just open a PR on the project itself (which is probably the easiest).

Read More

Configure a Rancher HAProxy health check

If you are familiar with ELB/ALB you will know that there are slight idiosyncrasies between the two.  For example, ELB allows you to health check a back end server by TCP port.  Basically allowing the user to check if a back end comes up and is listening on a specified port.  ALB is slightly different in its method for health checking.  ALB uses HTTP checks (layer 7) to ensure back end instances are up and listening.

This becomes a problem in Rancher, when you have multiple stacks in a single environment that are fronted by the Rancher HAProxy load balancer.  By default, the HAProxy config does not have a health check endpoint configured, so ALB is never able to know if the back end server is actually up and listening for requests.

A colleague and I  recently discovered a neat trick for solving this problem if you are fronting your environment with an ALB.  The solution to this conundrum is to sprinkle a little bit of custom configuration to the Rancher HAProxy config.

In Rancher, you can modify the live settings without downtime.  Click on the load balancer that sits behind the ALB and navigate to the Custom haproxy.cfg tab.

haproxy config

Modify the HAProxy config by adding the following:

# Use to report haproxy's status
defaults
    mode http
    monitor-uri /_ping

Click the “Edit” button to apply these changes and you should be all set.

Next, find the health check configuration for the associated ALB in the AWS console and add a check the the /_ping path on port 80 (or whichever port you are exposing/plan to listen on).  It should look similar to the following example.

Health checks

Below is an example that maps a DNS name to an internal Nginx container that is listening for requests on port 80.

HAPRoxy configuration

The check in ALB ensures that the HAProxy load balancer in Rancher is up and running before allowing traffic to be routed to it.  You can verify that your Rancher load balancer is working if the instances behind your ALB start showing a status of healthy in the AWS console.

NOTE: If you don’t have any apps initially behind the Rancher load balancer (or that are listening on the port specified in the health check) the AWS instances behind ALB will remain unhealthy until you add configuration in Rancher for the stacks to be exposed, as pictured above.

After setting up HAProxy, publicly accessible services in private Rancher environments can easily be managed by updating the HAProxy config.  Just add a dns name and a service to link to and HAProxy is able to figure out how and where to route requests to.  To map other services that aren’t listening on port 80, the process is  very similar.  Use the above as a guideline and simply update the target port to whichever port the app is listening on internally.

Read More

Powershell for Linux!

Microsoft has been making a lot of inroads in the Open Source and Linux communities lately.  Linux and Unix purists undoubtedly have been skeptical of this recent shift.  Canonical for example, has caught flak for partnering with Microsoft recently.  But the times are changing, so instead of resenting this progress, I chose to embrace it.  I’ll even admit that I actually like many of the Open Source contributions Microsoft has been making – including a flourishing Github account, as well as an increasingly rich, and cross platform platform set of tools that includes Visual Studio Code, Ubuntu/Bash for Windows, .NET Core and many others.

If you want to take the latest and greatest in Powershell v6 for a spin on a Linux system, I recommend using a Docker container if available.  Otherwise just spin up an Ubuntu (14.04+) VM and you should be ready.  I do not recommend trying out Powershell for any type of workload outside of experimentation, as it is still in alpha for Linux.  The beta v6 release (with Linux support) is around the corner but there is still a lot of ground that needs to be covered to get there.  Since Powershell is Open Source you can follow the progress on Github!

If you use the Docker method, just pull and run the container:

docker run -it --rm ubuntu:16.04 bash

Then add the Microsoft Ubuntu repo:

# apt-transport-https is needed for connecting to the MS repo
apt-get update && apt-get install curl apt-transport-https
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list | tee /etc/apt/sources.list.d/microsoft.list

Update and install Powershell:

sudo apt-get update
sudo apt-get install -y powershell

Finally, start up Powershell:

powershell

If it worked, you should see a message for Powershell and a new command prompt:

# powershell
PowerShell
Copyright (C) 2016 Microsoft Corporation. All rights reserved.

PS />

Congratulations, you now have Powershell running in Linux.  To take it for a spin, try a few commands out.

Write-Host "Hello Wordl!"

This should print out a hello world message.  The Linux release is still in alpha, so there will surely be some discrepancies between Linux and Windows based systems, but the majority of cmdlets should work the same way.  For example, I noticed in my testing that the terminal was very flaky.  Reverse search (ctrl+r) and the Get-History cmdlet worked well, but arrow key scrolling through history did not.

You can even run Powershell on OSX now if you choose to.  I haven’t tried it yet, but is an option for those that are curious.  Needless to say, I am looking for the

Read More