Turbocharge your Ansible Playbooks

If you haven’t discovered Mitogen yet, read on for how to use it (and a few other tricks) to make your Ansible plays a much better experience.

In short, Mitogen is a Python library that (among other things) provides an alternative way to connect to distributed machines using tools like Ansible, Salt and Fabric. And it is fast. Like really fast. Here is a note taken from the Mitogen documentation.

Expect a 1.25x – 7x speedup and a CPU usage reduction of at least 2x, depending on network conditions, modules executed, and time already spent by targets on useful work. Mitogen cannot improve a module once it is executing, it can only ensure the module executes as quickly as possible.

As the documentation says, Mitogen isn’t intended to be used directly but has entrypoints for connecting various tools with its API.

Here is what the sample output might look like with SSH pipelining and the other tweaks configured, but without Mitogen. It clocks in at around 120 seconds.

And here is the same run again with Mitogen. The same playbook run is down to around 90 seconds, about a 25% improvement, as shown below. The output and a few of the other settings are described in more detail below.

[Screenshots: playbook run output without Mitogen (~120s) and with Mitogen (~90s)]

To set up Mitogen as a replacement for Ansible’s connection layer, first install it. Note the version: in my own testing, versions earlier than 0.2.5 had some issues.

cd </path/to/install>
curl -OL https://files.pythonhosted.org/packages/source/m/mitogen/mitogen-0.2.5.tar.gz
tar xvzf mitogen-0.2.5.tar.gz

Then modify the ansible.cfg file to point at Mitogen.

[defaults]
strategy_plugins = </path/to/install>/mitogen-0.2.5/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

An option was added in Mitogen v0.2.4 to disable SSH compression, which can reduce run times on faster networks. The documentation says this option will become the default in the future, but for now you can turn it on with the following configuration.

mitogen_ssh_compression = False

NOTE: If you are having trouble with Mitogen and need to turn it off, you should also be aware of SSH pipelining. This method of execution isn’t as fast as Mitogen but should at least help bring playbook times down. You can turn it on with the following configuration.

[ssh_connection]
pipelining = True
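
One caveat worth knowing about pipelining: it doesn’t play well with privilege escalation (become) when requiretty is enabled in sudoers on the target hosts, so you may need to disable that. A minimal sketch of the sudoers change, assuming your targets enable requiretty at all:

# /etc/sudoers on the managed hosts
Defaults !requiretty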

There are a few other bells and whistles that you can adjust in the ansible.cfg file to help with performance and gain visibility into what is happening.

There are a few callback settings that can be added to ansible.cfg that make it much easier to see how long things take.

...
# Record some metrics about the Ansible runs
callback_whitelist = timer, profile_tasks
# Better output formatting
stdout_callback = yaml
# Minimal output formatting
#stdout_callback = minimal
callback_plugins = callback_plugins
...

Other settings that can be tuned include some of the defaults like poll_interval, caching and the number of forks to run. I found this blog post to be very helpful in discovering and describing a number of these Ansible tweaks.

Below is a modified ansible.cfg with these settings tuned.

# How often Ansible checks running tasks. The default is set to 15
poll_interval = 5

# Number of processes to fork.  Default is set to 5.
forks = 100

#caching
fact_caching            = jsonfile
fact_caching_connection = .cache/
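
If you enable fact caching like this, you will probably also want to control how long cached facts stick around. Ansible has a fact_caching_timeout setting (in seconds) for exactly that; for example, to expire cached facts after a day:

# Expire cached facts after 24 hours
fact_caching_timeout = 86400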

With these tweaks your Ansible playbooks should run much faster and more cleanly. I highly recommend giving Mitogen a try as well; I have not run into any issues with Mitogen 0.2.5, and it isn’t much effort to add for the amount of gains you get by switching to it. If you know of any other tweaks or settings, feel free to let me know!

Kubernetes Tips and Tricks

I have been getting more familiar with Kubernetes in the past few months and have uncovered some interesting capabilities that I had no idea existed when I started, which have come in handy for solving some unique problems.  I’m sure there are many more tricks I haven’t found, so please feel free to let me know of others you may know of.

Semi related: if you haven’t already checked it out, I wrote a post a while ago about some of the useful kubectl tricks I have discovered.  The CLI has improved since then, so I’m sure there are more and better tricks now, but it is still a good starting point for new users or folks that are just looking for more ideas of how to use kubectl.  Again, let me know of any other useful tricks and I will add them.

Kubernetes docs

The Kubernetes community has somewhat of a love/hate relationship with the documentation, although that relationship has been getting much better over time and continues to improve.  Almost everything I have discovered is scattered around the documentation; the main issue is that it is a little difficult to find unless you know what you’re looking for.  There is so much information packed into these docs, and so many features are tucked away that aren’t obvious to newcomers.  There are still a few gaps in the examples and general use cases, and the “why” of using various features is often lacking.

Another point I’d like to quickly cover is the API reference documentation.  When you are looking for some feature or functionality and the main documentation site fails you, this is the place to go look, as it covers everything available in Kubernetes.  Unfortunately the API reference is currently a challenge to use and is not user friendly (especially for newcomers), so if you do end up digging through it you will have to spend some time getting familiar with things, but it is definitely worth reading through to learn about capabilities you might not otherwise find.

For now, the best advice I have for working with the docs and testing functionality is trial and error.  Katacoda is an amazing resource for playing around with Kubernetes functionality, so definitely check that out if you haven’t yet.

Simple leader election

Leader election built on Kubernetes is really neat because it buys you a quick and dirty way to do some pretty complicated tasks.  Usually, implementing leader election requires extra software like ZooKeeper, etcd, Consul or some other distributed key/value store for keeping track of consensus, but it is built into Kubernetes, so you don’t have much extra work to get it working.

Leader election piggybacks off the same etcd that Kubernetes uses, along with Kubernetes annotations, which gives users a robust way to coordinate distributed tasks without having to reinvent the wheel for complicated leader elections.

Basically, you can deploy the leader-elector as a sidecar with any app you deploy.  Then, any container in the pod that’s interested in who the master is can check by visiting the HTTP endpoint (localhost:4044 by default), and it will get back some JSON with the current leader.
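
Here is a rough sketch of what that sidecar pattern looks like. The image and flags are taken from the upstream leader-elector example; treat the exact image tag and the application container as assumptions and swap in your own.

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  # your actual application container (hypothetical)
  - name: my-app
    image: my-app:latest
  # the leader-elector sidecar; every pod competing for leadership
  # uses the same --election name
  - name: leader-elector
    image: gcr.io/google_containers/leader-elector:0.5
    args:
    - --election=my-app
    - --http=localhost:4044

Any container in the pod can then check the current leader with something like curl http://localhost:4044.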

Shared process namespace across containers

This is currently a beta feature (as of 1.13), so it is enabled by default.  This one is interesting because it allows you to share a process namespace between the containers in a pod.  Unfortunately the docs don’t really tell you why this feature is useful.

Basically, if you add shareProcessNamespace: true to your pod spec, you turn on the ability to share a single PID namespace across the containers in the pod. This allows you to do things like changing a configuration file in one container and then sending a SIGHUP to reload that configuration in the process running in another container.

For example, you can run a sidecar that controls configuration files, or one that reaps orphaned zombie processes. An example pod spec from the docs is shown below.

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  shareProcessNamespace: true
  containers:
  - name: nginx
    image: nginx
  - name: shell
    image: busybox
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
    stdin: true
    tty: true
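
To see the shared namespace in action, you can attach to the shell container and poke at the nginx processes from there. A quick sketch (the nginx PID will vary, so treat the placeholder below as an assumption):

kubectl attach -it nginx -c shell
/ # ps ax                  # nginx processes from the other container show up here
/ # kill -HUP <nginx-pid>  # send SIGHUP across containers to reload nginx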

Custom termination messages

Custom termination messages can be useful when debugging tricky situations.

You can actually customize how termination messages get populated by setting the terminationMessagePolicy field on a container. For example, by using FallbackToLogsOnError you can tell Kubernetes to use the last chunk of container log output if the termination message file is empty and the container exited with an error.
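
As a quick illustration, here is what that might look like in a pod spec (the pod and container names are just placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: msg-policy-demo
spec:
  containers:
  - name: msg-policy-demo-container
    image: debian
    terminationMessagePolicy: FallbackToLogsOnError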

Likewise, you can set terminationMessagePath in the container spec to customize the file that the container’s termination message (success or failure) is written to when it terminates (the default is /dev/termination-log).

apiVersion: v1
kind: Pod
metadata:
  name: msg-path-demo
spec:
  containers:
  - name: msg-path-demo-container
    image: debian
    terminationMessagePath: "/tmp/my-log"

Container lifecycle hooks

Lifecycle hooks are really useful for doing things either right after a container has started (such as joining a cluster) or for running commands/code to clean up when a container is stopped (such as leaving a cluster).

Below is a straightforward example taken from the docs that writes a message after a pod starts and sends a quit signal to nginx when the pod is destroyed.

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
      preStop:
        exec:
          command: ["/usr/sbin/nginx","-s","quit"]

Kubernetes downward API

This one is probably better known, but I still think it is useful enough to add to the list.  The downward API basically allows you to grab all sorts of useful metadata about pods and containers, including host names and IP addresses.  It can also be used to retrieve information about resource requests and limits.

The simplest example to show off the downward API is to use it to configure a pod to use the hostname of the node as an environment variable.

apiVersion: v1
kind: Pod
metadata:
  name: downward-api-demo
spec:
  containers:
    - name: test-container
      image: k8s.gcr.io/busybox
      command: [ "sh", "-c"]
      args:
      - while true; do
          echo -en '\n';
          printenv MY_NODE_NAME;
          sleep 10;
        done;
      env:
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
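
Grabbing resource information looks similar, using resourceFieldRef instead of fieldRef. Here is a small fragment that would expose the container’s CPU limit as an environment variable (the variable name is just an example):

      env:
        - name: MY_CPU_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: test-container
              resource: limits.cpu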

Injecting a script into a container from a configmap

This is a useful trick when you want to add a layer on top of a Docker container but don’t necessarily want to build either a custom image or update an existing image.  By injecting the script as a configmap directly into the container you can augment a Docker image to do basically any extra work you need it to do.

The only caveat is that in Kubernetes, files mounted from configmaps are not executable by default.

In order to make your script work inside of Kubernetes, you simply need to add defaultMode: 0744 to your configmap volume spec. Then mount the configmap as a volume like you normally would, and you should be able to run your script as a normal command.

...
    volumeMounts:
    - name: wrapper
      mountPath: /scripts
  volumes:
  - name: wrapper
    configMap:
      name: wrapper
      defaultMode: 0744
...
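
For completeness, here is a sketch of what the corresponding configmap might look like, with the script name and contents as stand-ins for whatever wrapper logic you need:

apiVersion: v1
kind: ConfigMap
metadata:
  name: wrapper
data:
  wrapper.sh: |
    #!/bin/sh
    # do any extra setup work here, then hand off to the real process
    echo "running extra setup"
    exec "$@"

With the volume mounted at /scripts as above, the container can then run the script directly, for example with command: ["/scripts/wrapper.sh", "nginx", "-g", "daemon off;"].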

Using commands as liveness/readiness checks

This one is also pretty well known but often forgotten.  Using commands as health checks is a nice way to verify that things are working.  For example, if you are doing complicated DNS things and want to check whether DNS has updated, you can use dig.  Or if your app touches a file when it becomes healthy, you can run a command to check for it.

readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
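
And here is what the dig idea mentioned above might look like as a liveness probe, assuming the container image ships dig and with a made-up hostname:

livenessProbe:
  exec:
    command:
    - sh
    - -c
    # succeeds only if the name resolves to at least one record
    - dig +short myservice.example.com | grep -q .
  initialDelaySeconds: 10
  periodSeconds: 30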

Host aliases

Host aliases in Kubernetes offer a simple way to update the /etc/hosts file of a container.  This can be useful, for example, if a local hostname needs to be mapped to some DNS name that isn’t handled by the DNS server.

apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
  - ip: "127.0.0.1"
    hostnames:
    - "foo.local"
    - "bar.local"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"

Conclusion

As mentioned, these are just a few gems that I have uncovered; I’m sure there are a lot of other neat tricks out there.  As I get more experience using Kubernetes I will be sure to update this list.  Please let me know if there are things I missed or don’t know about that should be on here.

Deploy AWS SSM agent to CoreOS

If you have been a CoreOS user for long, you will undoubtedly have noticed that there is no real package management system.  If you’re not familiar, the philosophy of CoreOS is to avoid using a package manager and instead rely heavily on the power of Docker containers along with a few system-level tools to manage servers.  The problem I recently stumbled across is that the AWS SSM agent is packaged in Debian and RPM formats and is assumed to be installed with a package manager, which obviously won’t work on CoreOS.  In the remainder of this post I will describe the steps I took to get the SSM agent working on a CoreOS/Dockerized server.  Overall I am very happy with how well this solution turned out.

To get started, there is a nice tutorial here for using AWS Session Manager through the console.  The most important thing that needs to be done before “installing” the SSM agent on the CoreOS host is to set up the AWS instance with the correct permissions for the agent to be able to communicate with AWS.  To accomplish this, I created a new IAM role and attached the AmazonEC2RoleForSSM policy to it through the AWS console.
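
If you prefer the CLI over the console, the equivalent looks roughly like the following. The role and instance profile names are made up, and ec2-trust.json is assumed to be a standard trust policy that lets ec2.amazonaws.com assume the role.

aws iam create-role --role-name ssm-agent-role \
  --assume-role-policy-document file://ec2-trust.json
aws iam attach-role-policy --role-name ssm-agent-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleForSSM
aws iam create-instance-profile --instance-profile-name ssm-agent-profile
aws iam add-role-to-instance-profile --instance-profile-name ssm-agent-profile \
  --role-name ssm-agent-role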

After this step is done, you can bring up the ssm-agent.

Install the ssm-agent

After ensuring the correct permissions have been applied to the server that is to be managed, the next step is to bring up the agent.  To do this using Docker, there are some tricks needed to get things working correctly, notably fixing the PID 1 zombie reaping problem that Docker has.

I basically lifted the Dockerfile from here originally and adapted it into my own public Docker image at jmreicha/ssm-agent:latest.  In case readers want to try this, my image is a little bit newer than the original source and has a few tweaks.  The Dockerfile itself is mostly straightforward; the main difference is that the ssm-agent process won’t reap child processes in the default Debian image.

In order to work around the child reaping problem I substituted the slick Phusion Docker baseimage, which has a very simple process manager that allows shells spawned by the ssm-agent to be reaped when they get terminated.  I have my Dockerfile hosted here if you want to check out how the phusion baseimage version works.

Once the child reaping problem was solved, here is the command I initially used to spin up the container, which of course still didn’t work out of the box.

docker run \
  -v /var/run/dbus:/var/run/dbus \
  -v /run/systemd:/run/systemd \
  jmreicha/ssm-agent:latest

I received the following errors.

2018-11-05 17:42:27 INFO [OfflineService] Starting document processing engine...
2018-11-05 17:42:27 INFO [OfflineService] [EngineProcessor] Starting
2018-11-05 17:42:27 INFO [OfflineService] [EngineProcessor] Initial processing
2018-11-05 17:42:27 INFO [OfflineService] Starting message polling
2018-11-05 17:42:27 INFO [OfflineService] Starting send replies to MDS
2018-11-05 17:42:27 INFO [LongRunningPluginsManager] starting long running plugin manager
2018-11-05 17:42:27 INFO [LongRunningPluginsManager] there aren't any long running plugin to execute
2018-11-05 17:42:27 INFO [HealthCheck] HealthCheck reporting agent health.
2018-11-05 17:42:27 INFO [MessageGatewayService] Starting session document processing engine...
2018-11-05 17:42:27 INFO [MessageGatewayService] [EngineProcessor] Starting
2018-11-05 17:42:27 INFO [LongRunningPluginsManager] There are no long running plugins currently getting executed - skipping their healthcheck
2018-11-05 17:42:27 INFO [StartupProcessor] Executing startup processor tasks
2018-11-05 17:42:27 INFO [StartupProcessor] Unable to open serial port /dev/ttyS0: open /dev/ttyS0: no such file or directory
2018-11-05 17:42:27 INFO [StartupProcessor] Attempting to use different port (PV): /dev/hvc0
2018-11-05 17:42:27 INFO [StartupProcessor] Unable to open serial port /dev/hvc0: open /dev/hvc0: no such file or directory
2018-11-05 17:42:27 ERROR [StartupProcessor] Error opening serial port: open /dev/hvc0: no such file or directory
2018-11-05 17:42:27 ERROR [StartupProcessor] Error opening serial port: open /dev/hvc0: no such file or directory. Retrying in 5 seconds...
2018-11-05 17:42:27 INFO [MessageGatewayService] Successfully created ssm-user
2018-11-05 17:42:27 ERROR [MessageGatewayService] Failed to add ssm-user to sudoers file: open /etc/sudoers.d/ssm-agent-users: no such file or directory
2018-11-05 17:42:27 INFO [MessageGatewayService] [EngineProcessor] Initial processing
2018-11-05 17:42:27 INFO [MessageGatewayService] Setting up websocket for controlchannel for instance: i-0d33006836710e7ef, requestId: 2975fe0d-846d-4256-9d50-57932be03925
2018-11-05 17:42:27 INFO [MessageGatewayService] listening reply.
2018-11-05 17:42:27 INFO [MessageGatewayService] Opening websocket connection to: %!(EXTRA string=wss://ssmmessages.us-west-2.amazonaws.com/v1/control-channel/i-0d33006836710e7ef?role=subscribe&stream=input)
2018-11-05 17:42:27 INFO [MessageGatewayService] Successfully opened websocket connection to: %!(EXTRA string=wss://ssmmessages.us-west-2.amazonaws.com/v1/control-channel/i-0d33006836710e7ef?role=subscribe&stream=input)
2018-11-05 17:42:27 INFO [MessageGatewayService] Starting receiving message from control channel
2018-11-05 17:42:32 INFO [StartupProcessor] Unable to open serial port /dev/ttyS0: open /dev/ttyS0: no such file or directory
2018-11-05 17:42:32 INFO [StartupProcessor] Attempting to use different port (PV): /dev/hvc0
2018-11-05 17:42:32 INFO [StartupProcessor] Unable to open serial port /dev/hvc0: open /dev/hvc0: no such file or directory
2018-11-05 17:42:32 ERROR [StartupProcessor] Error opening serial port: open /dev/hvc0: no such file or directory
2018-11-05 17:42:32 ERROR [StartupProcessor] Error opening serial port: open /dev/hvc0: no such file or directory. Retrying in 5 seconds...
2018-11-05 17:42:35 INFO [MessagingDeliveryService] [Association] No associations on boot. Requerying for associations after 30 seconds.

The first error that jumped out in the logs is “Unable to open serial port”.  There is also an error about not being able to add the ssm-user to the sudoers file.

The fix for these issues is to pass the CoreOS serial device through to the container with the Docker flag "--device=/dev/ttyS0" and to bind mount the sudoers path with "-v /etc/sudoers.d:/etc/sudoers.d".  The full Docker run command is shown below.

docker run -d --restart unless-stopped --name ssm-agent \
  --device=/dev/ttyS0 \
  -v /var/run/dbus:/var/run/dbus \
  -v /run/systemd:/run/systemd \
  -v /etc/sudoers.d:/etc/sudoers.d \
  jmreicha/ssm-agent:latest

After fixing the errors found in the logs, and bringing up the containerized SSM agent, go ahead and create a new session in the AWS console.

[Screenshot: starting a new session in the AWS Session Manager console]

The session should come up pretty much immediately and you should be able to run commands like you normally would.

The last (optional) step is to run the agent as a systemd service, so it starts up automatically if it dies or the server gets rebooted.  You can probably get away with just using the Docker restart policy if you aren’t interested in configuring a systemd service, which is what I have chosen to do for now.
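
If you do want to go the systemd route, a rough (untested) sketch of a unit file wrapping the Docker command above might look like this:

[Unit]
Description=AWS SSM agent in a container
After=docker.service
Requires=docker.service

[Service]
# clean up any leftover container from a previous run
ExecStartPre=-/usr/bin/docker rm -f ssm-agent
# note: no -d or --restart here; systemd keeps the process in the foreground
ExecStart=/usr/bin/docker run --name ssm-agent \
  --device=/dev/ttyS0 \
  -v /var/run/dbus:/var/run/dbus \
  -v /run/systemd:/run/systemd \
  -v /etc/sudoers.d:/etc/sudoers.d \
  jmreicha/ssm-agent:latest
ExecStop=/usr/bin/docker stop ssm-agent
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target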

You could even adapt this Docker image into a Kubernetes manifest and run it as a daemonset on each node of the cluster if desired to simplify things and add another layer of security.  I may return to the systemd unit and/or Kubernetes manifest in the future if readers are interested.

Conclusion

[Screenshot: session history in the AWS console]

The AWS Session manager is a fantastic tool for troubleshooting/debugging as well as auditing and security.

With SSM you can avoid exposing specific servers to the internet directly, and you can also keep track of what kinds of commands have been run on them.  As a bonus, the AWS console helps keep track of all the previous sessions that were created, and if you hook it up to CloudWatch and/or S3 you can see all the commands and the times they were run, with nice simple links to the log files.

SSM also allows you to do a lot of other cool stuff, like running scripts against either a subset of servers filtered by tags or against all servers that are recognized by SSM.  I’m sure there are other features as well; I just haven’t found them yet.

Multiarch Docker builds using Shippable

Recently I have been experimenting with different ways of building multi architecture Docker images.  As part of this process I wrote about Docker image manifests and the different ways you can package multi architecture builds into a single Docker image.  Packaging the images is only half the problem though.  You basically need to create the different Docker images for the different architectures first, before you are able to package them into manifests.

There are several ways to go about building the Docker images for various architectures.  In the remainder of this post I will be showing how you can build Docker images natively against arm64 only as well as amd64/arm64 simultaneously using some slick features provided by the folks at Shippable.  Having the ability to automate multi architecture builds with CI is really powerful because it avoids having to use other tools or tricks which can complicate the process.

Shippable recently announced integrated support for arm64 builds.  The steps for creating these cross-platform builds are fairly straightforward and are documented on their website.  The only downside to using this method is that currently you must explicitly contact Shippable and request access to the arm64 pool of nodes for running jobs, but after that multi-arch builds should be available.

For reference, here is the full shippable.yml file I used to test out the various types of builds and their options.

Arm64 only builds

After enabling the shippable_shared_aarch64 node pool (from the instructions above) you should have access to arm64 builds; just add the following block to your shippable.yml file.

runtime:
  nodePool: shippable_shared_aarch64

The only other change that needs to be made is to point the shippable.yml file at the newly added node pool, and you should be ready to build on arm64.  You can use the default “managed” build type in Shippable to create builds.

Below I have a very simple example shippable.yml file for building a Dockerfile and pushing its image to my Dockerhub account.  The shippable.yml file for this build lives in the GitHub repo I configured Shippable to track.

language: none

runtime:
  nodePool:
    - shippable_shared_aarch64
    - default_node_pool

build:

  ci:
    - sed -i 's|registry.fedoraproject.org/||' Dockerfile.fedora-28
    - docker build -t local/freeipa-server -f Dockerfile.fedora-28 .
    - tests/run-master-and-replica.sh local/freeipa-server

  post_ci:
    - docker tag local/freeipa-server jmreicha/freeipa-server:test
    - docker push jmreicha/freeipa-server:test

integrations:
  hub:
    - integrationName: dockerhub
      type: dockerRegistryLogin

Once you have a shippable.yml file in a repo you would like to track and have things set up on the Shippable side, every time a commit/merge happens on the master branch (or whatever branch you set up Shippable to track) an arm64 Docker image gets built and pushed to Dockerhub.

Docs for setting up this CI-style job can be found here.  There are many other configuration settings available to tune, so I would encourage you to read the docs and play around with the various options.

Parallel arm64 and amd64 builds

The approach for doing the simultaneous parallel builds is a little different and adds some complexity, but I think it is worth it for the ability to automate cross-platform builds.  There are a few things to note about the below configuration.  You can use templates in either style of job.  Also, notice the use of the shipctl command.  This tool basically allows you to mimic some of the functionality that exists in the default runCI jobs, including the ability to log in to Docker registries via shell commands, and it manages other tricky parts of the build pipeline, like moving into the correct directory to build from.

Most of the rest of the config is pretty straightforward.  The top-level jobs directive lets you create multiple different jobs, which in turn allows you to set the runtime to use different node pools; this is how we build against both amd64 and arm64.  Jobs also allow for setting different environment variables, among other things.  The full docs for jobs show all of their various capabilities.

templates: &build-test-push
  - export HUB_USERNAME=$(shipctl get_integration_field "dockerhub" "username")
  - export HUB_PASSWORD=$(shipctl get_integration_field "dockerhub" "password")
  - docker login --username $HUB_USERNAME --password $HUB_PASSWORD
  - cd $(shipctl get_resource_state "freeipa-container-gitRepo")
  - sed -i 's|registry.fedoraproject.org/||' Dockerfile.fedora-27
  - sed -i 's/^# debug:\s*//' Dockerfile.fedora-27
  - docker build -t local/freeipa-server -f Dockerfile.fedora-27 .
  - tests/run-master-and-replica.sh local/freeipa-server
  - docker tag local/freeipa-server jmreicha/freeipa-server:$arch
  - docker push jmreicha/freeipa-server:$arch

resources:
    - name: freeipa-container-gitRepo
      type: gitRepo
      integration: freeipa-container-gitRepo
      versionTemplate:
          sourceName: jmreicha/freeipa-container
          branch: master

jobs:
  - name: build_amd64
    type: runSh
    runtime:
      nodePool: default_node_pool
      container: true
    integrations:
      - dockerhub
    steps:
      - IN: freeipa-container-gitRepo
      - TASK:
          runtime:
            options:
              env:
                - privileged: --privileged
                # Also look at using SHIPPABLE_NODE_ARCHITECTURE env var
                - arch: amd64
          script:
            - *build-test-push

  - name: build_arm64
    type: runSh
    runtime:
      nodePool: shippable_shared_aarch64
      container: true
    integrations:
      - dockerhub
    steps:
      - IN: freeipa-container-gitRepo
      - TASK:
          runtime:
            options:
              env:
                - privileged: --privileged
                - arch: arm64
          script:
            - *build-test-push

As you can see, there is a lot more manual configuration going on here than in the first job.

I decided to use the top-level templates directive to DRY out the configuration so that it can be reused.  I am also setting environment variables per job to ensure the correct architecture gets built and pushed for the various platforms.  Otherwise the configuration is mostly straightforward.  The confusion with these types of jobs, if you haven’t set them up before, mostly comes from figuring out where things get configured in the Shippable UI.
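
Once both jobs have pushed their architecture-specific tags, they can be stitched together into a single multi-arch image using the Docker manifest tooling I covered in my earlier post. A quick sketch, with the combined tag name as an example (note that docker manifest requires the experimental CLI features to be enabled):

docker manifest create jmreicha/freeipa-server:test \
  jmreicha/freeipa-server:amd64 \
  jmreicha/freeipa-server:arm64
docker manifest annotate jmreicha/freeipa-server:test \
  jmreicha/freeipa-server:arm64 --os linux --arch arm64
docker manifest push jmreicha/freeipa-server:test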

Conclusion

I must admit, Shippable is really easy to get started with and has good support and good documentation.  I am definitely a fan and will recommend and use their products whenever I get a chance.  If you are familiar with Travis then using Shippable is easy.  Shippable even supports the use of Travis-compatible environment variables, which makes porting over Travis configs really easy.  I hope to see more platforms and architectures supported in the future, but for now arm64 is a great start.

There are some downsides to using the parallel builds for multi-architecture images.  Namely, there is more overhead in setting up the job initially.  With runSh (and other unmanaged jobs) you don’t really have access to some of the top-level yml declarations that come with managed jobs, so you will need to spend more time figuring out how to wire up the logic manually using shell commands and the shipctl tool, as depicted in my example above.  This ends up being more flexible in the long run but also harder to understand and get working to begin with.

Another downside of the assembly line style jobs like runSh is that they currently can’t leverage all the features that the runCI job can, including the matrix generation (though there is a feature request to add it in the future) and report parsing.

The last downside when setting up unmanaged jobs is figuring out how to wire up the different components on the Shippable side of things.  For example, you don’t just create a runCI job like in the first example.  You have to first create an integration with the repo you are configuring so that Shippable can create an rSync job and several runSh jobs to connect with the repo and work correctly.

Overall though, I love both the runSh and runCI jobs.  Both types of jobs lend themselves to being flexible and composable and are very easy to work with.  I’d also like to mention that the support has been excellent, which is a big deal to me.  The support team was super responsive and helpful in sorting out my issues.  They even opened some PRs on my test repo to fix some issues.  And as far as I know, there are no other CI systems currently offering native arm64 builds, which I believe will become more important as the ARM architecture continues to gain momentum.

Building k8s Manifests with Helm Templates

As I have started working more with Kubernetes lately, I have found it very valuable to see what a manifest looks like before deploying it.  Helm can basically be used as a quick and dirty way to render a chart and see exactly what it produces.  This also has the security advantage of not running Tiller in your production cluster, if you choose to deploy the rendered templates yourself.

Helm has been sort of a subject of contention for a while now.  Security folks REALLY don’t like running the server-side component because it basically allows root access into your cluster unless it is managed a specific way, which tends to add much more complexity to the cluster.  There are plans in Helm 3 to remove the server-side component, as well as to offer some more flexible configuration options that don’t rely on Go templating, but that functionality isn’t ready yet, so I find rendering and deploying a nice middle ground for now.

At the same time, Helm does have some nice selling points which make it a good option for certain situations.  I’d say the main draw is that it is ridiculously easy to set up and use, which is especially nice for local development, testing, or just figuring out how things work in Kubernetes.  The other thing Helm does that is difficult to do otherwise is manage deployments, versions and environments, although a number of users have had issues with these features.

Also check out Kustomize.  If you aren’t familiar, it is basically a tool for managing per-environment customizations for yaml manifests and configurations.  You can get pretty far by rendering templates and overlaying Kustomize on top of other configurations for managing different environments, etc.

Render a template (client side)

The first step to getting a working rendered template is to install the Helm client-side component. There are installation instructions for various platforms here.

brew install kubernetes-helm # (on OSX)

You will also need to grab some charts to test with.

git clone git@github.com:kubernetes/charts.git
cd charts/stable/metallb
helm template --namespace test --name test .

Below is an example with customized variables.

helm template --namespace test --name test --set controller.resources.limits.cpu=100m .

You can dump the rendered template to a file if you want to look at it or change anything.

helm template --namespace test --name test --set controller.resources.limits.cpu=100m . > helm-test.yaml
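
As mentioned earlier, this rendered file is also a natural place to layer Kustomize on top.  A minimal sketch of a kustomization.yaml that patches the rendered output (the patch file name is hypothetical):

# kustomization.yaml
resources:
- helm-test.yaml
patchesStrategicMerge:
- my-env-patch.yaml

Running kustomize build . | kubectl apply -f - would then apply the patched manifests.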

You can even deploy these rendered templates directly if you want to.

helm template --namespace test --name test --set controller.resources.limits.cpu=100m . | kubectl apply -f -

Render a template (server side)

Make sure Tiller is running in the cluster first.  If you haven’t set up Helm on the server side before, this basically amounts to deploying Tiller into the cluster.  Again, I would not recommend doing this anywhere outside of a throwaway or testing environment.  After the Helm client has been installed, you can use it to spin up Tiller in the cluster.

helm init

Below is a basic example using the metallb chart.

helm install --namespace test --name test stable/metallb --dry-run --debug

Again, you can use customized variables.

helm install --namespace test --name test stable/metallb --set controller.resources.limits.cpu=100m --dry-run --debug

You may notice some extra configuration at the very beginning of the output.  This is basically just showing the default values that get applied, as well as anything that has been customized by the user.  It is a quick way to see what kinds of things can be changed in the Helm chart.

Conclusion

Helm offers many other commands and options so I definitely recommend playing around with it and exploring the other things it can do.

I like to use both of these methods, but for now I prefer to run a local Tiller instance in a throwaway cluster (Docker for Mac) and pull in charts from the upstream repositories, without having to git clone them, if I’m just looking at how the Kubernetes manifest configuration works.  You can’t really use the server-side rendering to actually deploy the manifests though, because it sticks a bunch of other information into the command output.

All in all, Helm templating is pretty powerful, and combining it with something like Kustomize should get you around 90% of the way there, unless you are managing much more complex and complicated configurations.  The only thing this method doesn’t lend itself well to is managing releases and other metadata.  Otherwise it is a great way to manage configurations.
