Testing for Deprecated Kubernetes APIs

November 10, 2019November 11, 2019 Josh Reichardt 1 Comment

Kubernetes API changes are coming up and I wanted to make a quick blog post to highlight what this means and show a few of things I have discovered to deal with the changes.

First, there have been some relevant announcements regarding the changes and deprecations recently. The first being the API Depractions in 1.16 announcement, which describes the changes to the API and some of the things to look at and do to fix problems

The next post is the Kubernetes 1.16 release announcement, which contains a section “Significant Changes to the Kubernetes API” that references the deprecation post.

Another excellent resource for learning about how Kubernetes deprecations work is the API deprecation documentation, highlighted in the deprecation post, but not widely shared.

In my opinion, the Kubernetes community really dropped the ball in terms of communicating these changes and missed an opportunity to describe and discuss the problems that these changes will create. I understand that the community is gigantic and it would be impossible to cover every case, but to me, the few blog posts describing the changes and not much other official communication or guides for how to handle and fix the impending problems is a little bit underwhelming.

The average user probably doesn’t pay attention to these blog posts, and there are a lot of old Helm charts out in the wild still, so I’m confident that the incoming changes will create headaches and table flips when people start upgrading. As an example, if you have an old API defined and running in a pre 1.16 cluster, and upgrade without fixing the API version first, APPS IN YOUR CLUSTER WILL BREAK. The good news is that new clusters won’t allow the old API versions, making errors easier to see and deal with.

Testing for and fixing deprecated APIs

With that mini rant out of the way, there is a simple but effective way to test your existing configurations for API compatibility.

Conftest is a nice little tool that helps write tests against structured configuration data, using the Rego language using Open Policy Agent (OPA). Conftest works with many file types including JSON, TOML and HCL, which makes it a great choice for testing a variety of different configurations, but is especially useful for testing Kubernetes YAML configurations.

To get started, install conftest.

wget https://github.com/instrumenta/conftest/releases/download/v0.15.0/conftest_0.15.0_Linux_x86_64.tar.gz
tar xzf conftest_0.15.0_Linux_x86_64.tar.gz
sudo mv conftest /usr/local/bin

Then we can use the handy policy provided by the deprek8 repo to validate the API versions.

curl https://raw.githubusercontent.com/naquada/deprek8/master/policy/deprek8.rego > deprek8.rego
conftest test -p deprek8.rego sample/manifest.yaml

Here’s what a FAIL condition might look like according to what is defined in the rego policy file for an outdated API version.

FAIL - sample/manifest.yaml - Deployment/my-deployment: API extensions/v1beta1 for Deployment is no longer served by default, use apps/v1 instead.

The Rego policy is what actually defines the behavior that Conftest will display and as you can see, it found an issue with the Deployment object defined in the test manifest.

Below is the Rego policy that causes Conftest to spit out the FAILure message. The syntax is clean and easy to follow, so writing and adjusting policies is easy.

_deny = msg {
  resources := ["DaemonSet", "Deployment", "ReplicaSet"]
  input.apiVersion == "extensions/v1beta1"
  input.kind == resources[_]
  msg := sprintf("%s/%s: API extensions/v1beta1 for %s is no longer served by default, use apps/v1 instead.", [input.kind, input.metadata.name, input.kind])
}

Once you know what is wrong with the configuration, you can use the kubectl convert subcommand to fix up the existing deprecated API objects. Again, attempting to create objects using deprecated APIs in 1.16 will be rejected automatically by Kubernetes, so you will only need to deal with converting existing objects in old clusters being upgraded.

From the above error, we know the object type (Deployment) and the version (extensions/v1beta1). With this information we can run the convert command to fix the object.

# General syntax
kubectl convert -f <file> --output-version <group>/<version>

# The --output-version flag allows specifying the API version to upgrade to 
kubectl convert -f sample/manifest.yaml  --output-version apps/v1

# Omitting the --output-version flag will convert to the latest version
kubectl convert -f sample/manifest.yaml

After the existing objects have been converted and any manifest files have been updated you should be safe to upgrade Kubernetes.

Bonus

There was a fantastic episode of TGIK awhile back called Kubernetes API Removal and You that describes in great detail what all of the deprections mean and how to fix them – definitely worth a watch if you have the time.

Conclusion

OPA and testing configurations using tools like conftest and Rego policies is a great way to harden and help standardize configurations. Taken a step further, these configuration testing tools can be extended to test all sorts of other things.

Conftest looks especially promising because of the number of file types that it understands. There is a lot of potential here for doing things like unit testing Kubernetes configuration files and other things like Terraform configs.

I haven’t written any Rego policies yet but the language looks pretty straight forward and easy to deal with. I think that as configurations continue to evolve, tools like Conftest (OPA), Kubeval and Kustomize will gain more traction and help simplify some of the complexities of Kubernetes.

Quickly securing local secrets

September 15, 2019September 16, 2019 Josh Reichardt 1 Comment

One thing I have run into recently and have been thinking about a little bit lately, is a simple way to hide environment variables that contain sensitive information. For example, when working in a local environment, if you need access to a secret like an oauth token or some authentication method to an API, the first inclination is usually to just hard code the secret contents into your local bash/zsh profile so that it can be read anytime you need access to it. This method obviously will work but if the filesystem itself isn’t encrypted, the secret can easily be leaked and for a small amount of effort I believe I have found an effective way of shrinking the visibility of these secrets.

Inspired by the aws-vault tool which is a simple but secure way of storing local AWS credentials in environment variables using a local password store, in this post I will show you a quick and dirty way to add an extra layer of security to your (other) local environment by injecting sensitive secrets stored in an encrypted location (password store) into your local terminal. This method works for both OSX and Linux and is just a few lines of configuration and examples for both OSes are shown below.

In OSX the keychain is a good starting place for storing and retrieving secrets and in Linux the combination of GPG and the standard unix password manager “pass” work well together. Pass also works on OSX if you aren’t a fan of keychain.

Below are steps for storing and retrieving local secrets using the Linux pass tool. There are installation instructions and full documentation for how to use the tool in the link above. It should also be noted that the system needs to have GPG installed in order to write and read secrets.

One you have GPG configured, create the password store. I am skipping most of the GPG configuration because there is a lot to know, the command below should be enough to get things started. If you already have GPG set up and configured you can skip the setup.

Set up GPG and pass.

gpg2 --full-gen-key # follow prompts to create a gpg store with defaults
pass init <email> # use the same email address used with gpg
pass git init # optionally set pass up as a git repo

To create/edit a secret.

#pass insert/edit <secret>
pass insert mysecret
pass edit mysecret

Pass allows for hierarchies but in the example we are just going to put the secret at the top level. The command above will open the default editor. After closing the editor, the password will be written to an encrypted file in ~/.password-store. Once you have added the password you can show the contents of the newly added secret.

To read a secret into the terminal.

#pass show <secret>
pass show mysecret

You can also quickly list all of your secrets.

pass ls

Now that we have a created secret, we can write a little bash function to pull out the contents of the password and export them as an environment variable when the shell gets sourced. Put the following snippet into your ~/.bashrc, ~/.zshrc or ~/.bashprofile to read secrets.

get_password () {
  pass show "$1"
}

A similar result can be achieved in OSX using the “security” command line tool.

get_password () {
  security find-generic-password -ga "$1" -w
}

In your shell configuration file you can simply export the result of calling the get_password() function into an environment variable.

export MYSECRET="$(get_password mysecret)"

Source the shell profile to pickup the new changes. After that, you should now see the contents of the secret inside an environment variable in your terminal.

source ~/.bashrc
env | grep MYSECRET

Conclusion

Obviously this isn’t a perfect way to secure your environment since the secret is available to anyone who is able to connect to this user so make sure you practice good security in as many other ways as possible.

What this method does do though is cuts down the amount of sensitive information that can be gleaned from a user account by ensuring that shell secrets are encrypted at rest and unavailable as clear text.

Configure S3 to store load balancer logs using Terraform

March 21, 2018 Josh Reichardt Leave a comment

If you’ve ever encountered the following error (or similar) when setting up an AWS load balancer to write its logs to an s3 bucket using Terraform then you are not alone. I decided to write a quick note about this problem because it is the second time I have been bitten by this and had to spend time Googling around for an answer. The AWS documentation for creating and attaching the policy makes sense but the idea behind why you need to do it is murky at best.

aws_elb.alb: Failure configuring ELB attributes: InvalidConfigurationRequest: Access Denied for bucket: <my-bucket> Please check S3bucket permission
status code: 409, request id: xxxx

For reference, here are the docs for how to manually create the policy by going through the AWS console. This method works fine for manually creating and attaching to the policy to the bucket. The problem is that it isn’t obvious why this needs to happen in the first place and also not obvious to do in Terraform after you figure out why you need to do this. Luckily Terraform has great support for IAM, which makes it easy to configure the policy and attach it to the bucket correctly.

Below is an example of how you can create this policy and attach it to your load balancer log bucket.

data "aws_elb_service_account" "main" {}

data "aws_iam_policy_document" "s3_lb_write" {
    policy_id = "s3_lb_write"

    statement = {
        actions = ["s3:PutObject"]
        resources = ["arn:aws:s3:::<my-bucket>/logs/*"]

        principals = {
            identifiers = ["${data.aws_elb_service_account.main.arn}"]
            type = "AWS"
        }
    }
}

Notice that you don’t need to explicitly define the principal like you do when setting up the policy manually. Just use the ${data.aws_elb_service_account.main.arn} variable and Terraform will figure out the region that the bucket is in and pick out the correct parent ELB ID to attach to the policy. You can verify this by checking the table from the link above and cross reference it with the Terraform output for creating and attaching the policy.

You shouldn’t need to update anything in the load balancer config for this to work, just rerun the failed command again and it should work. For completeness here is what that configuration might look like.

...
access_logs {
    bucket = "${var.my_bucket}"
    prefix = "logs"
    enabled = true
}
...

This process is easy enough but still begs the question of why this seemingly unnecessary process needs to happen in the first place? After searching around for a bit I finally found this:

When Amazon S3 receives a request—for example, a bucket or an object operation—it first verifies that the requester has the necessary permissions. Amazon S3 evaluates all the relevant access policies, user policies, and resource-based policies (bucket policy, bucket ACL, object ACL) in deciding whether to authorize the request.

Okay, so it basically looks like when the load balancer gets created, the load balancer gets associated with an AWS owned ID, which we need to explicitly give permission to, through IAM policy:

If the request is for an operation on an object that the bucket owner does not own, in addition to making sure the requester has permissions from the object owner, Amazon S3 must also check the bucket policy to ensure the bucket owner has not set explicit deny on the object

Note

A bucket owner (who pays the bill) can explicitly deny access to objects in the bucket regardless of who owns it. The bucket owner can also delete any object in the bucket.

There we go. There is a little bit more information in the link above but now it makes more sense.

Fix the JenkinsAPI No valid crumb error

June 7, 2017 Josh Reichardt 1 Comment

If you are working with the Python based JenkinsAPI library you might run into the No valid crumb was included in the request error. The error below will probably look familiar if you’ve run into this issue.

Traceback (most recent call last):
 File "myscript.py", line 47, in <module>
 deploy()
 File "myscript.py", line 24, in deploy
 jenkins.build_job('test')
 File "/usr/local/lib/python3.6/site-packages/jenkinsapi/jenkins.py", line 165, in build_job
 self[jobname].invoke(build_params=params or {})
 File "/usr/local/lib/python3.6/site-packages/jenkinsapi/job.py", line 209, in invoke
 allow_redirects=False
 File "/usr/local/lib/python3.6/site-packages/jenkinsapi/utils/requester.py", line 143, in post_and_confirm_status
 response.text.encode('UTF-8')
jenkinsapi.custom_exceptions.JenkinsAPIException: Operation failed. url=https://jenkins.example.com/job/test/build, data={'json': '{"parameter": [], "statusCode": "303", "redirectTo": "."}'}, headers={'Content-Type': 'application/x-www-form-urlencoded'}, status=403, text=b'<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>\n<title>Error 403 No valid crumb was included in the request</title>\n</head>\n<body><h2>HTTP ERROR 403</h2>\n<p>Problem accessing /job/test/build. Reason:\n<pre> No valid crumb was included in the request</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.z-SNAPSHOT</a><hr/>\n\n</body>\n</html>\n'

It is good practice to enable additional security in Jenkins by turning on the “Prevent Cross Site Forgery exploits” option in the security settings, so if you see this error it is a good thing. The below example shows this security feature in Jenkins.

The Fix

This error threw me off at first, but it didn’t take long to find a quick fix. There is a crumb_requester class in the jenkinsapi that you can use to create the crumbed auth token. You can use the following example as a guideline in your own code.

from jenkinsapi.jenkins import Jenkins
from jenkinsapi.utils.crumb_requester import CrumbRequester

JENKINS_USER = 'user'
JENKINS_PASS = 'pass'
JENKINS_URL = 'https://jenkins.example.com'

# We need to create a crumb for the request first
crumb=CrumbRequester(username=JENKINS_USER, password=JENKINS_PASS, baseurl=JENKINS_URL)

# Now use the crumb to authenticate against Jenkins
jenkins = Jenkins(JENKINS_URL, username=JENKINS_USER, password=JENKINS_PASS, requester=crumb)

...

The code looks very similar to creating a normal Jenkins authentication object, the only difference being that we create and then pass in a crumb for the request, rather than just a username/password combination. Once the crumbed authentication object has been created, you can continue writing your Python code as you would normally. If you’re interested in learning more about crumbs and CSRF you can find more here, or just Google for CSRF for more info.

This issue was slightly confusing/annoying, but I’d rather deal with an extra few lines of code and know that my Jenkins server is secure.

Enable SSL for your WordPress blog

December 5, 2015November 18, 2016 Josh Reichardt Leave a comment

Updated: 11/18/16

The Let’s Encrypt client was recently renamed to “certbot”. I have updated the post to use the correct name but if I miss something use certbot or let me know.

With the announcement of the public beta of the Let’s Encrypt project, it is now nearly trivial to get your site set up with an SSL certificate. One of the best parts about the Let’s Encrypt project is that it is totally free, so there is pretty much no reason to protect your blog set up with an SSL certificate. The other nice part of Let’s Encrypt is that it is very easy to get your certificate issued.

The first step to get started is grabbing the latest source code from GitHub for the project. Log on to your WordPress server (I’m running Ubuntu) and clone the repo. Make sure to install git if you haven’t already.

git clone https://github.com/letsencrypt/certbot.git

There is a shell script you can run to pretty much do everything for you, including installation of any packages and libraries it needs as well as configures paths and other components it needs to work.

cd certbot
./certbot-auto

After the bootstrap is done there should be some CLI options. Run the command with the -h flag to print out help.

./certbot-auto -h

Since I am using Apache for my blog I will use the “–apache” option.

./certbot-auto --apache

There will be some prompts you need to go through for setting up the certificates and account creation.

This process is still somewhat error prone, so if you make a typo you can just rerun the “./letsencrypt-auto” command and follow the prompts.

The certificates will be dropped in to /etc/letsencrypt/live/<website>. Go double check them if needed.

This process will also generate a new apache configuration file for you to use. You can check for the file in /etc/apache2/site-enabled. The import part of this config should look similar to the following:

<VirtualHost *:443>
  UseCanonicalName Off
  ServerAdmin webmaster@localhost
  DocumentRoot /var/www/wordpress
  SSLCertificateFile /etc/letsencrypt/live/thepracticalsysadmin.com/cert.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/thepracticalsysadmin.com/privkey.pem
  Include /etc/letsencrypt/options-ssl-apache.conf
  SSLCertificateChainFile /etc/letsencrypt/live/thepracticalsysadmin.com/chain.pem
</VirtualHost>

As a side note, you will probably want to redirect non https requests to use the encrypted connection. This is easy enough to do, just go find your .htaccess file (mine was in /var/www/wordpress/.htaccess) and add the following rules.

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{SERVER_PORT} 80
  RewriteRule ^(.*)$ https://example.com/$1 [R,L]
</IfModule>

Before we restart Apache with the new configuration let’s run a quick configtest to make sure it all works as expected.

apachectl configtest

If everything looks okay in the configtest then you can reload or restart apache.

service apache2 restart

Now when you visit your site you should get the nice shiny green lock icon on the address bar. It is important to remember that the certificates issued by the Let’s Encrypt project are valid for 90 days so you will need to make sure to keep up to date and generate new certificates every so often. The Let’s Encrypt folks are working on automating this process but for now you will need to manually generate new certificates and reload your web server.

That’s it. Your site should now be functioning with SSL.

Updating the certificate automatically

To take this process one step further We can make a script that can be run via cron (or manually) to update the certificate.

Here’s what the script looks like.

#!/usr/bin/env bash

dir="/etc/letsencrypt/live/example.com"
acme_server="https://acme-v01.api.letsencrypt.org/directory"
domain="example.com"
https="--standalone-supported-challenges tls-sni-01"

# Using webroot method
#/root/letsencrypt/certbot-auto --renew certonly --server $acme_server -a webroot --webroot-path=$dir -d $domain --agree-tos

# Using standalone method
service apache2 stop
# Previously you had to specify options to renew the cert but this has been deprecated
#/root/letsencrypt/certbot-auto --renew certonly --standalone $https -d $domain --agree-tos
# In newer versions you can just use the renew command
/root/letsencrypt/certbot-auto renew --quiet
service apache start

Notice that I have the “webroot” method commented out. I run a service (Varnish) on port 80 that proxies traffic but also interferes with LE so I chose to run the standalone renewal method. It is pretty easy, the main difference is that you need to turn off Apache before you run it since Apache binds to to ports 80/443. But the downtime is okay in my case.

I chose to put the script in to a cron job and have it run every 45 days so that I don’t have to worry about logging on to my server to regenerate the certificate. Here’s what a sample crontab for this job might look like.

0 0 */45 * * /root/renew_cert.sh

This is a straight forward process and will help with your search engine juices as well.