Export SNMP metrics with the Prometheus Operator

There are quite a few use cases for monitoring outside of Kubernetes, especially for previously built infrastructure and otherwise legacy systems.  Additional monitoring adds an extra layer of complexity to your monitoring setup and configuration, but fortunately Prometheus makes this extra complexity easier to manage and maintain, inside of Kubernetes.

In this post I will describe a nice clean way to monitor things that are internal to Kubernetes using Prometheus and the Prometheus Operator.  The advantage of this approach is that it allows the Operator to manage and monitor infrastructure, and it allows Kubernetes to do what it’s good at; make sure the things you want are running for you in an easy to maintain, declarative manifest.

If you are already familiar with the concepts in Kubernetes then this post should be pretty straight forward.  Otherwise, you can pretty much copy/paste most of these manifests into your cluster and you should have a good way to monitor things in your environment that are external to Kubernetes.

Below is an example of how to monitor external network devices using the Prometheus SNMP exporter.  There are many other exporters that can be used to monitor infrastructure that is external to Kubernetes but currently it is recommended to set up these configurations outside of the Prometheus Operator to basically separate monitoring concerns (which I plan on writing more about in the future).

Create the deployment and service

Here is what the deployment might look like.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: snmp-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: snmp-exporter
  template:
    metadata:
    labels:
      app: snmp-exporter
  spec:
    containers:
    - image: oakman/snmp-exporter
    command: ["/bin/snmp_exporter"]
    args: ["--config.file=/etc/snmp_exporter/snmp.yml"]
    name: snmp-exporter
    ports:
    - containerPort: 9116
      name: metrics

And the accompanying service.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: snmp-exporter
  name: snmp-exporter
spec:
  ports:
  - name: http-metrics
    port: 9116
    protocol: TCP
    targetPort: metrics
  selector:
    app: snmp-exporter

At this point you would have a pod in your cluster, attached to a static IP address.  To see if it worked you can check to make sure a service IP was created.  The service is basically what the Operator uses to create targets in Prometheus.

kubectl get sv

From this point you can 1) set up your own instance of Prometheus using Helm or by deploying via yml manifests or 2) set up the Prometheus Operator.

Today we will walk through option 2, although I will probably cover option 1 at some point in the future.

Setting up the Prometheus Operator

The beauty of using the Prometheus Operator is that it gives you a way to quickly add or change Prometheus specific configuration via the Kubernetes API (custom resource definition) and some custom objects provided by the operator, including AlertManager, ServiceMonitor and Prometheus objects.

The first step is to install Helm, which is a little bit outside of the scope of this post but there are lots of good guides on how to do it.  With Helm up and running you can easily install the operator and the accompanying kube-prometheus manifests which give you access to lots of extra Kubernetes metrics, alerts and dashboards.

helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
helm install --name prometheus-operator --set rbacEnable=true --namespace monitoring coreos/prometheus-operator
helm install coreos/kube-prometheus --name kube-prometheus --namespace monitoring

After a few moments you can check to see that resources were created correctly as a quick test.

kubectl get pods -n monitoring

NOTE: You may need to manually add the “prometheus” service account to the monitoring namespace after creating everything.  I ran into some issues because Helm didn’t do this automatically.  You can check this with kubectl get events.

Prometheus Operator configuration

Below are steps for creating custom objects (CRDs) that the Prometheus Operator uses to automatically generate configuration files and handle all of the other management behind the scenes.

These objects are wired up in a way that configs get reloaded and Prometheus will automatically get updated when it sees a change.  These object definitions basically convert all of the Prometheus configuration into a format that is understood by Kubernetes and converted to Prometheus configuration with the operator.

First we make a servicemonitor for monitoring the the snmp exporter.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: snmp-exporter
    prometheus: kube-prometheus # tie servicemonitor to correct Prometheus
  name: snmp-exporter
spec:
  jobLabel: k8s-app
  selector:
    app: snmp-exporter
  namespaceSelector:
    matchNames:
    - monitoring

  endpoints:
  - interval: 60s
    port: http-metrics
    params:
      module:
      - if_mib # Select which SNMP module to use
      target:
      - 1.2.3.4 # Modify this to point at the SNMP target to monitor
    path: "/snmp"
    targetPort: 9116

Next, we create a custom alert and tie it our Prometheus Operator.  The alert doesn’t do anything useful, but is a good demo for showing how easy it is to add and manage alerts using the Operator.

Create an alert-example.yml configuration file, add it as a configmap to k8s and mount it in as a configuration with the ruleSelector label selector and the prometheus operator will do the rest. Below shows how to hook up a test rule into an existing Prometheus (kube-prometheus) alert manager, handled by the prometheus-operator.

kind: ConfigMap
apiVersion: v1
metadata:
 name: josh-test
 namespace: monitoring
 labels:
 role: alert-rules # Standard convention for organizing alert rules
 prometheus: kube-prometheus # tie to correct Prometheus
data:
 test.rules.yaml: |
 groups:
 - name: test.rules # Top level description in Prometeheus
 rules:
 - alert: TestAlert
 expr: vector(1)

Once you have created the rule definition via configmap just use kubectl to create it.

kubectl create -f alert-example.yml -n monitoring

Testing and troubleshooting

You will probably need to port forward the pod to get access to the IP and port in the cluster

kubectl port-forward snmp-exporter-<name> 9116

Then you should be able to visit the pod in your browser (or with curl).

localhost:9116

The exporter itself does a lot more so you will probably want to play around with it.  I plan on covering more of the details of other external exporters and more customized configurations for the SNMP exporter.

For example, if you want to do any sort of monitoring past basic interface stats, etc. you will need to generate and build your own set of MIBs to gather metrics from your infrastructure and also reconfigure your ServiceMonitor object in Kubernetes to use the correct MIBs so that the Operator updates the configuration correctly.

Conclusion

The amount of options for how to use Prometheus is one area of confusion when it comes to using Prometheus, especially for newcomers.  There are lots of ways to do things and there isn’t much direction on how to use them, which can also be viewed as a strength since it allows for so much flexibility.

In some situations it makes sense to use an external (non Operator managed Prometheus) when you need to do things like manage and tune your own configuration files.  Likewise, the Prometheus Operator is a great fit when you are mostly only concerned about managing and monitoring things inside Kubernetes and don’t need to do much external monitoring.

That said, there is some support for external monitoring using the Prometheus Operator, which I want to write about in a different post.  This support is limited to a handful of different external exporters (for the time being) so the best advice is to think about what kind of monitoring is needed and choose the best solution for your own use case.  It may turn out that both types of configurations are needed, but it may also end up being just as easy to use one method or another to manage Prometheus and its configurations.

Read More

Mount a volume using Ignition and Terraform

Sometimes when provisioning a server you may want to configure and provision storage as part of the bootstrapping and booting process.  For example, the other day I ran into an issue where I needed to define a disk, partition it, mount it to a specified location and then create a few directories in it.  It turned out to be surprisingly not straight forward to provision this storage and I learned quite a few things that I thought were worth sharing.

I’d just like to mention that ignition works like magic.  If you aren’t familiar, Ignition is basically a tool to help provision and configure servers, very similar to cloud-config except by default Ignition only runs once, on first boot.  The magic of Ignition is that it injects itself into initramfs before the OS ever eve boots and manipulating the system.  Ignition can be read in from  remote URL so that it can easily be provisioned in bare metal infrastructures.  There were several pieces to this puzzle.

The first was getting down all of the various ignition configuration components in Terraform.  Nothing was particularly complicated, there was just a lot of trial and error to get everything working.  Terraform has some really nice documentation for working with Ignition configurations, I’d recommend starting there and just playing around to figure out some of the various bits and pieces of configuration that Ignition can do.  There is some documentation on Ignition troubleshooting as well which I found to be helpful when things weren’t working correctly.

Below each portion of the Ignition configuration gets declared inside of a “ignition_config” block.  The Ignition configuration then points towards each invidual component that we want Ignition to configure. e.g. systemd, filesystem, directories, etc.

data "ignition_config" "staging_rancher_host_stateful" {
  systemd = [
     "${data.ignition_systemd_unit.mount_data.id}",
  ]

  filesystems = [
    "${data.ignition_filesystem.data_fs.id}",
  ]

  directories = [
    "${data.ignition_directory.data_dir.id}",
  ]

  disks = [
    "${data.ignition_disk.data_disk.id}",
  ]
}

This part of the setup is pretty straight forward.  Create a data block with the needed ignition configuration to mount the disk to the correct location,  format the device if it hasn’t already been formatted and create the desired directory and then create the Systemd unit to configure the mount point for the OS.  Here’s what each of the data blocks might look like.

data "ignition_filesystem" "data_fs" {
   name = "data"

  mount {
    device = "/dev/xvdb1"
    format = "ext4"
  }
}

data "ignition_directory" "data_dir" {
  filesystem = "data"
  path = "/data"
  uid = 500
  gid = 500
}

data "ignition_disk" "data_disk" {
  device = "/dev/xvdb"

  partition {
    number = 1
    start = 0
    size = 0
  }
}

Next, create the Systemd unit.

data "ignition_systemd_unit" "mount_data" {
  content = "${file("./data.mount")}"
  name = "data.mount"
}

Another challenge was getting the Systemd unit to mount the disk correctly.  I don’t work with Systemd frequently so initially had some trouble figuring this part out.  Basically, Systemd expects the service/unit definition name to EXACTLY match what’s declared inside the “Where” clause of the service definition.

For example, the following configuration needs to be named data.mount because that is what is defined in the service.

[Unit]
Description=Mount /data
Before=local-fs.target

[Mount]
What=/dev/xvdb1
Where=/data
Type=ext4

[Install]
WantedBy=local-fs.target

After all the kinks have been worked out of the Systemd unit(s) and other above Terraform Ignition configuration you should be able to deploy this and have Ignition provision disks for you automatically when the OS comes up.  This can be extended as much as needed for getting initial disks  set up correctly and is a huge step in automating your infrastructure in a nice repeatable way.

There is currently an open issue with Ignition currently where it breaks when attempting to re-provision a previously configured disk on a new machine.  Basically the Ignition process chokes because it sees the device has already been partitioned and formatted and can’t do it again.  I ran into this scenario where I was trying to create a basically floating persistent data EBS volume that gets attached to servers in an autoscaling group and wanted to allow the volume to be able to move around freely if the server gets killed off.

Read More

Test Rancher 2.0 using Minikube

If you haven’t heard yet, Rancher recently revealed news that they will be building out a new v2.0 of their popular container orchestration and management platform to be built specifically to run on top of Kubernetes.  In the container realm, Kubernetes has recently become a clear favorite in the battle of orchestration and container management.  There are still other options available, but it is becoming increasingly clear that Kubernetes has the largest community, user base and overall feature set so a lot of the new developments are building onto Kubernetes rather than competing with it directly.  Ultimately I think this move to build on Kubernetes will be good for the container and cloud community as companies can focus more narrowly now on challenges tied specifically around security, networking, management, etc, rather than continuing to invent ways to run containers.

With Minikube and the Docker for Mac app, testing out this new Rancher 2.0 functionality is really easy.  I will outline the (rough) process below, but a lot of the nuts and bolts are hidden in Minikube and Rancher.  So if you’re really interested in learning about what’s happening behind the scenes, you can take a look at the Minikube and Rancher logs in greater detail.

Speaking of Minkube and Rancher, there are a few low level prerequisites you will need to have installed and configured to make this process work smoothly, which are listed out below.

Prerequisites

  • Tested on OSX
  • Get Minikube working – use the Kubernetes/Minikube notes as a reference (you may need to bump memory to 4GB)
  • Working version of kubectl
  • Install and configure docker for mac app

I won’t cover the installation of these perquisites, but I have blogged about a few of them before and have provided links above for instructions on getting started if you aren’t familiar with any of them.

Get Rancher 2.0 working locally

The quick start guide on the Rancher website has good details for getting this working – http://rancher.com/docs/rancher/v2.0/en/quick-start-guide/.  On OSX you can use the Docker for Mac app to get a current version of Docker and compose.  After Docker is installed, the following command will start the Rancher container for testing.

docker run -d --restart=unless-stopped -p 8080:8080 --name rancher-server rancher/server:preview

Check that you can access the Rancher 2.0 UI by navigating to http://localhost:8080 in your browser.

If you wanted to dummy a host name to make access a little bit easier you could just add an extra entry to /etc/hosts.

Import Minikube

You can import an existing cluster into the Rancher environment.  Here we will import the local Minikube instance we got going earlier so we can test out some of the new Rancher 2.0 functionality.  Alternately you could also add a host from a cloud provider.

In Rancher go to Hosts, Use Existing Kubernetes.

Use existing Kubernetes

Then grab the IP address that your local machine is using on your network.  If you aren’t familiar, on OSX you can reach into the terminal and type “ifconfig” and pull out the IP your machine is using.  Also make sure to set the port to 8080, unless you otherwise modified the port map earlier when starting Rancher.

host registration url

Registering the host will generate a command to run that applies configuration on the Kubernetes cluster.  Just copy this kubectl command in Rancher and run it against your Minikube machine.

kubectl url

The above command will join Minikube into the Rancher environment and allow Rancher to manage it.  Wait a minute for the Rancher components (mainly the rancher-agent continer/pod) to bootstrap into the Minikube environment.  Once everything is up and running, you can check things with kubectl.

kubectl get pods --all-namespaces | grep rancher

Alternatively, to verify this, you can open the Kubernetes dashboard with the “minikube dashboard” command and see the rancher-agent running.

kubernetes dashboard

On the Rancher side of things, after a few minutes, you should see the Minikube instance show up in the Rancher UI.

rancher dashboard

That’s it.  You now have a working Rancher 2.0 instance that is connected to a Kubernetes cluster (Minikube).  Getting the environment to this point should give you enough visibility into Rancher and Kubernetes to start tinkering and learning more about the new features that Rancher 2.0 offers.

The new Rancher 2.0 UI is nice and simplifies a lot of the painful aspects of managing and administering a Kubernetes cluster.  For example, on each host, there are metrics for memory, cpu, disk, etc. as well as specs about the server and its hardware.  There are also built in conveniences for dealing with load balancers, secrets and other components that are normally a pain to deal with.  While 2.0 is still rough around the edges, I see a lot of promise in the idea of building a management platform on top Kubernetes to make administrative tasks easier, and you can still exec to the container for the UI and check logs easily, which is one of my favorite parts about Rancher.  The extra visualization is a nice touch for folks that aren’t interested in the CLI or don’t need to know how things work at a low level.

When you’re done testing, simply stop the rancher container and start it again whenever you need to test.  Or just blow away the container and start over if you want to start Rancher again from scratch.

Read More

Templated Nginx configuration with Bash and Docker

Shoutout to @shakefu for his Nginx and Bash wizardry in figuring a lot of this stuff out.  I’d like to take credit for this, but he’s the one who got a lot of it working originally.

Sometimes it can be useful to template Nginx files to use environment variables to fine tune and adjust control for various aspects of Nginx.  A recent example of this idea that I recently worked on was a scenario where I setup an Nginx proxy with a very bare bones configuration.  As part of the project, I wanted a quick and easy way to update some of the major Nginx configurations like the port it uses to listen for traffic, the server name, upstream servers, etc.

It turns out that there is a quick and dirty way to template basic Nginx configurations using Bash, which ended up being really useful so I thought I would share it.  There are a few caveats to this method but it is definitely worth the effort if you have a simple setup or a setup that requires some changes periodically.  I stuck the configuration into a Dockerfile so that it can be easily be updated and ported around – by using the nginx:alpine image as the base image the total size all said and done is around 16MB.  If you’re not interested in the Docker bits, feel free to skip them.

The first part of using this method is to create a simple configuration file that will be used to substitute in some environment variables.  Here is a simple template that is useful for changing a few Nginx settings.  I called it nginx.tmpl, which will be important for how the template gets rendered later.

events {}

http {
  error_log stderr;
  access_log /dev/stdout;

  upstream upstream_servers {
    server ${UPSTREAM};
  }

  server {
    listen ${LISTEN_PORT};
    server_name ${SERVER_NAME};
    resolver ${RESOLVER};
    set ${ESC}upstream ${UPSTREAM};

    # Allow injecting extra configuration into the server block
    ${SERVER_EXTRA_CONF}

    location / {
       proxy_pass ${ESC}upstream;
    }
  }
}

The configuration is mostly straight forward.  We are basically just using this configuration file and inserting a few templated variables denoted by the ${VARIABLE} syntax, which are just environment variables that get inserted into the configuration when it gets bootstrapped.  There are a few “tricks” that you may need to use if your configuration starts to get more complicated.  The first is the use of the ${ESC} variable.  Nginx uses the ‘$’ for its variables, which also is used by the template.  The extra ${ESC} basically just gives us a way to escape that $ so that we can use Nginx variables as well as templated variables.

The other interesting thing that we discovered (props to shakefu for this magic) was that you can basically jam arbitrary server block level configurations into an environment variable.  We do this with the ${SERVER_EXTRA_CONF} in the above configuration and I will show an example of how to use that environment variable later.

Next, I created a simple Dockerfile that provides some default values for some of the various templated variables.  The Dockerfile aslso copies the templated configuration into the image, and does some Bash magic for rendering the template.

FROM nginx:alpine

ENV LISTEN_PORT=8080 \
  SERVER_NAME=_ \
  RESOLVER=8.8.8.8 \
  UPSTREAM=icanhazip.com:80 \
  UPSTREAM_PROTO=http \
  ESC='$'

COPY nginx.tmpl /etc/nginx/nginx.tmpl

CMD /bin/sh -c "envsubst < /etc/nginx/nginx.tmpl > /etc/nginx/nginx.conf && nginx -g 'daemon off;' || cat /etc/nginx/nginx.conf"

There are some things to note.  First, not all of the variables in the template need to be declared in the Dockerfile, which means that if the variable isn’t set it will be blank in the rendered template and just won’t do anything.  There are some variables that need defaults, so if you ever run across that scenario you can just add them to the Dockerfile and rebuild.

The other interesting thing is how the template gets rendered.  There is a tool built into the shell called envsubst that substitutes the values of environment variables into files.  In the Dockerfile, this tool gets executed as part of the default command, taking the template as the input and creating the final configuration.

/bin/sh -c "envsubst < /etc/nginx/nginx.tmpl > /etc/nginx/nginx.conf

Nginx gets started in a slightly silly way so that daemon mode can be disabled (we want Nginx running in the foreground) and if that fails, the rendered template gets read to help look for errors in the rendered configuration.

&& nginx -g 'daemon off;' || cat /etc/nginx/nginx.conf"

To quickly test the configuration, you can create a simple docker-compose.yml file with a few of the desired environment variables, like I have below.

version: '3'
services:
  nginx_proxy:
    build:
      context: .
      dockerfile: Dockerfile
    # Only test the configuration
    #command: /bin/sh -c "envsubst < /etc/nginx/nginx.tmpl > /etc/nginx/nginx.conf && cat /etc/nginx/nginx.conf"
    volumes:
      - "./nginx.tmpl:/etc/nginx/nginx.tmpl"
    ports:
      - 80:80
    environment:
    - SERVER_NAME=_
    - LISTEN_PORT=80
    - UPSTREAM=test1.com
    - UPSTREAM_PROTO=https
    # Override the resolver
    - RESOLVER=4.2.2.2
    # The following would add an escape if it isn't in the Dockerfile
    # - ESC=$$

Then you can bring up Nginx server.

docker-compose up

The configuration doesn’t get rendered until the container is run, so to test the configuration only, you could add in a command in the docker-compose file that renders the configuration and then another command that spits out the rendered configuration to make sure it looks right.

If you are interested in adding additional configuration you can use the ${SERVER_EXTRA_CONF} as eluded to above.  An example of this extra configuration can be assigned to the environment variable.  Below is an arbitrary snippet that allows for connections to do long polling to Nginx, which basically means that Nginx will try to hold the connection open for existing connections for longer.

error_page 420 = @longpoll;
if ($arg_wait = "true") { return 420; }
}
location @longpoll {
# Proxy requests to upstream
proxy_pass $upstream;
# Allow long lived connections
proxy_buffering off;
proxy_read_timeout 900s;
keepalive_timeout 160s;
keepalive_requests 100000;

The above snipped would be a perfectly valid environment variable as far as the container is concerned, it will just look a little bit weird to the eye.

nginx proxy environment variables

That’s all I’ve got for now.  This minimal templated Nginx configuration is handy for testing out simple web servers, especially for proxies and is also nice to port around using Docker.

Read More

Top Five Reasons to Use a Hybrid Cloud

Hybrid cloud

Guest post by Aventis Systems

Cloud computing is increasing in popularity as business users have become more comfortable with cloud capabilities. According to a recent VMware report, some 15% of workloads currently reside in the public cloud, and 50% are projected to be running in the public cloud by 2030.

Some business customers will move completely to the public cloud, drawn by its ability to help them respond to changing business needs, align costs and stay on the cutting edge of innovation. But a complete move isn’t the best option for all customers. Some processes still simply run more efficiently and securely on on-premise hardware.

For many businesses, there is no one-size-fits-all solution. Ultimately, using a mix of public clouds and private infrastructures is the best way for many companies to make the most of their resources while optimizing performance and productivity.

What Is a Hybrid Cloud?
A hybrid cloud is a combination of a private cloud platform designed for use by a specific organization and a public cloud provider like Amazon Web Services (AWS) or Google Cloud. These public clouds are shared by customers all over the world and are a cheaper alternative to buying physical servers.

Though the public and private cloud platforms operate independently from one another, they can communicate over an encrypted connection.

A hybrid approach enables data and applications to move between the public and private infrastructures. These are independent platforms, so businesses can store protected data on the private cloud while still leveraging applications that rely on that protected data on the public cloud.

In other words, your sensitive data stays out of the public cloud and on the private platform. The challenge is integrating the different public and private clouds and technologies in a way that is seamless for business users.

Here are some of the top benefits of a hybrid cloud approach, according to the experts at Aventis Systems:

1. Workload Flexibility
With hybrid cloud technology, your IT team has the flexibility to match resources with the infrastructure that best serves the needs of your business.

For example, an integrated hybrid cloud approach with VMware Cloud on AWS enables you to decide where to most effectively run workloads based on cost, risk and changing business needs. With the flexibility to move workloads onsite or offsite as needed, IT is able to better serve the business as a whole.

The ability for organizations to easily transition applications without having to re-platform them — along with the ability to effortlessly access and leverage native cloud services — enables businesses to create a flexible infrastructure in a constantly evolving IT landscape. When new technology becomes available or new trends emerge, businesses are agile enough to take advantage of them quickly.

2. Consistency and Scalability
VMware Cloud on AWS enables companies to leverage operational consistency, along with scalability, on one streamlined platform. By maintaining security and networking policies, along with consistent resource utilization both on- and off-premise, businesses can benefit most from a hybrid infrastructure.

Customers can strategically leverage and allocate company resources to get the most out of system functionality, while becoming better positioned for growth. As your business and capacity needs grow, a hybrid cloud infrastructure offers an easy way to scale to fit these complex needs.

3. Improved Security
Maintaining secure customer transaction data and personal information with a hybrid cloud infrastructure also offers a major benefit over an exclusively public platform. The hybrid approach enables specified servers to be isolated from specific security threats by allowing devices to be configured to communicate with them on a private network.

Where some compliance requirements prevent businesses from running payments in the cloud, for example, a hybrid cloud platform allows you to house secure customer data on a dedicated server, while maintaining the flexibility and convenience of online transactions.

4. Maximized Skillsets and Cost Optimization
Not only are hybrid clouds less expensive to manage, with VMware Cloud on AWS, business customers can also reap the benefits of utilizing their existing IT investments.

Hybrid cloud offerings integrate with your existing IT and use many of the same tools as those used on-premise. You can leverage the resources you already have without having to adopt new tools or acquire new hardware.

Additionally, an integrated hybrid cloud approach enables customers to better align their costs to business needs. Upfront costs can be balanced with recurring expenses, depending on the requirements.

5. Innovation
With a hybrid cloud approach, your business will have access to all of the resources on the public cloud without the burden of big upfront investments. With access to all the newest technologies and innovations, you can stay on the forefront of the latest capabilities.

As businesses become more comfortable and reliant on cloud capabilities, more and more companies will look for the right mix of public cloud and on-premise infrastructure models to increase efficiency and performance.

Read More