Export SNMP metrics with the Prometheus Operator

There are quite a few use cases for monitoring things outside of Kubernetes, especially for previously built infrastructure and otherwise legacy systems.  This additional monitoring adds an extra layer of complexity to your monitoring setup and configuration, but fortunately Prometheus, running inside Kubernetes, makes that extra complexity easier to manage and maintain.

In this post I will describe a nice clean way to monitor things that are external to Kubernetes using Prometheus and the Prometheus Operator.  The advantage of this approach is that it allows the Operator to manage and monitor infrastructure, and it allows Kubernetes to do what it’s good at: making sure the things you want are running for you in easy to maintain, declarative manifests.

If you are already familiar with the concepts in Kubernetes then this post should be pretty straightforward.  Otherwise, you can pretty much copy/paste most of these manifests into your cluster and you should end up with a good way to monitor things in your environment that are external to Kubernetes.

Below is an example of how to monitor external network devices using the Prometheus SNMP exporter.  There are many other exporters that can be used to monitor infrastructure that is external to Kubernetes, but currently I recommend setting those configurations up outside of the Prometheus Operator to keep monitoring concerns separate (which I plan on writing more about in the future).

Create the deployment and service

Here is what the deployment might look like.

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: snmp-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: snmp-exporter
  template:
    metadata:
      labels:
        app: snmp-exporter
    spec:
      containers:
      - image: oakman/snmp-exporter
        command: ["/bin/snmp_exporter"]
        args: ["--config.file=/etc/snmp_exporter/snmp.yml"]
        name: snmp-exporter
        ports:
        - containerPort: 9116
          name: metrics

And the accompanying service.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: snmp-exporter
  name: snmp-exporter
spec:
  ports:
  - name: http-metrics
    port: 9116
    protocol: TCP
    targetPort: metrics
  selector:
    app: snmp-exporter
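
Assuming the two manifests above are saved as snmp-exporter-deployment.yml and snmp-exporter-service.yml (the file names are arbitrary), creating them is just a couple of kubectl commands:

kubectl apply -f snmp-exporter-deployment.yml
kubectl apply -f snmp-exporter-service.yml

Note that the ServiceMonitor later in the post selects the monitoring namespace, so either create these objects there (add -n monitoring) or adjust the namespaceSelector to match.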

At this point you should have a pod running in your cluster with a service in front of it providing a stable IP address.  To see if it worked you can check to make sure a service IP was created.  The service is basically what the Operator uses to create targets in Prometheus.

kubectl get svc

From this point you can 1) set up your own instance of Prometheus using Helm or by deploying via yml manifests or 2) set up the Prometheus Operator.

Today we will walk through option 2, although I will probably cover option 1 at some point in the future.

Setting up the Prometheus Operator

The beauty of using the Prometheus Operator is that it gives you a way to quickly add or change Prometheus specific configuration via the Kubernetes API, using custom resource definitions and some custom objects provided by the Operator, including the Alertmanager, ServiceMonitor and Prometheus objects.

The first step is to install Helm, which is a little bit outside of the scope of this post but there are lots of good guides on how to do it.  With Helm up and running you can easily install the operator and the accompanying kube-prometheus manifests which give you access to lots of extra Kubernetes metrics, alerts and dashboards.

helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
helm install --name prometheus-operator --set rbacEnable=true --namespace monitoring coreos/prometheus-operator
helm install coreos/kube-prometheus --name kube-prometheus --namespace monitoring

After a few moments you can check to see that resources were created correctly as a quick test.

kubectl get pods -n monitoring

NOTE: You may need to manually add the “prometheus” service account to the monitoring namespace after creating everything.  I ran into some issues because Helm didn’t do this automatically.  You can check this with kubectl get events.
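
If you do hit that problem, creating the service account by hand should get things moving again; something like the following (the name "prometheus" matches the note above, adjust to whatever the events complain about):

kubectl create serviceaccount prometheus -n monitoring
kubectl get events -n monitoring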

Prometheus Operator configuration

Below are steps for creating the custom objects (defined by the Operator’s CRDs) that the Prometheus Operator uses to automatically generate configuration files and handle all of the other management behind the scenes.

These objects are wired up in a way that configuration gets reloaded and Prometheus is updated automatically when it sees a change.  The object definitions express the Prometheus configuration in a format that is understood by Kubernetes, and the Operator converts them back into native Prometheus configuration behind the scenes.

First we make a ServiceMonitor for monitoring the SNMP exporter.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: snmp-exporter
    prometheus: kube-prometheus # tie servicemonitor to correct Prometheus
  name: snmp-exporter
spec:
  jobLabel: k8s-app
  selector:
    matchLabels:
      app: snmp-exporter
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - interval: 60s
    port: http-metrics
    params:
      module:
      - if_mib # Select which SNMP module to use
      target:
      - 1.2.3.4 # Modify this to point at the SNMP target to monitor
    path: "/snmp"
    targetPort: 9116
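
Assuming the ServiceMonitor above is saved to a file (snmp-servicemonitor.yml is just a made-up name), apply it and then port forward the Prometheus instance to confirm the new snmp-exporter target shows up under Status -> Targets:

kubectl apply -f snmp-servicemonitor.yml -n monitoring

# The Prometheus service name below is a guess; check "kubectl get svc -n monitoring"
# for whatever the kube-prometheus chart created in your cluster
kubectl port-forward svc/kube-prometheus 9090 -n monitoring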

Next, we create a custom alert and tie it to our Prometheus Operator.  The alert doesn’t do anything useful, but it is a good demo for showing how easy it is to add and manage alerts using the Operator.

Create an alert-example.yml configuration file, add it to Kubernetes as a configmap, and label it so that the ruleSelector label selector picks it up; the Prometheus Operator will do the rest.  Below shows how to hook up a test rule into the existing Prometheus (kube-prometheus) and Alertmanager setup handled by the prometheus-operator.

kind: ConfigMap
apiVersion: v1
metadata:
  name: josh-test
  namespace: monitoring
  labels:
    role: alert-rules # Standard convention for organizing alert rules
    prometheus: kube-prometheus # tie to correct Prometheus
data:
  test.rules.yaml: |
    groups:
    - name: test.rules # Top level description in Prometheus
      rules:
      - alert: TestAlert
        expr: vector(1)

Once you have created the rule definition via configmap just use kubectl to create it.

kubectl create -f alert-example.yml -n monitoring
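
To confirm the rule actually made it into the cluster, check that the configmap exists (the name josh-test comes from the manifest above); once Prometheus reloads, TestAlert should show up on its /alerts page:

kubectl get configmap josh-test -n monitoring -o yaml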

Testing and troubleshooting

You will probably need to port forward the pod to get access to its IP and port inside the cluster.

kubectl port-forward snmp-exporter-<name> 9116

Then you should be able to visit the pod in your browser (or with curl).

localhost:9116
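
You can also hit the exporter’s /snmp endpoint directly with the same module and target parameters the ServiceMonitor uses, which is a quick way to confirm the exporter can actually reach your device (1.2.3.4 is the placeholder target from earlier):

curl "http://localhost:9116/snmp?module=if_mib&target=1.2.3.4"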

The exporter itself can do a lot more, so you will probably want to play around with it.  I plan on covering more of the details of other external exporters and more customized configurations for the SNMP exporter in future posts.

For example, if you want to do any sort of monitoring beyond basic interface stats, you will need to generate and build your own set of MIBs to gather metrics from your infrastructure, and also reconfigure your ServiceMonitor object in Kubernetes to use the corresponding module so that the Operator updates the configuration correctly.
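
If you do go down that road, the snmp_exporter project ships a generator tool that turns a generator.yml (listing the MIBs and OIDs you care about) into the snmp.yml the exporter consumes.  Here is a rough sketch of that workflow, with the caveat that the exact make targets, dependencies and paths may differ between versions:

git clone https://github.com/prometheus/snmp_exporter.git
cd snmp_exporter/generator
make mibs              # pull down a base set of MIBs (drop your own into the mibs/ directory)
go build               # build the generator binary (needs the net-snmp development libraries)
./generator generate   # reads generator.yml and writes out a new snmp.yml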

Conclusion

The sheer number of options is one area of confusion when it comes to using Prometheus, especially for newcomers.  There are lots of ways to do things and there isn’t much direction on how to use them.  In some situations it makes sense to use an external (non-Operator-managed) Prometheus, for example when you need to manage and tune your own configuration files.  Likewise, the Prometheus Operator is a great fit when you are mostly concerned with managing and monitoring things inside Kubernetes and don’t need to do much external monitoring.

That said, there is some support for external monitoring using the Prometheus Operator, which I will be showing later.  This support is limited to a handful of different external exporters so the best advice is to think about what kind of monitoring is needed and choose the best solution for your own use case.  It may turn out that both types of configurations are needed, but it may also end up being just as easy to use one method or another to manage Prometheus and its configurations.


Mount a volume using Ignition and Terraform

Sometimes when provisioning a server you may want to configure and provision storage as part of the bootstrapping and booting process.  For example, the other day I ran into an issue where I needed to define a disk, partition it, mount it to a specified location and then create a few directories on it.  It turned out to be surprisingly non-trivial to provision this storage, and I learned quite a few things along the way that I thought were worth sharing.

I’d just like to mention that Ignition works like magic.  If you aren’t familiar with it, Ignition is basically a tool to help provision and configure servers, very similar to cloud-config except that by default Ignition only runs once, on first boot.  The magic of Ignition is that it injects itself into the initramfs and manipulates the system before the OS ever boots.  Ignition configs can also be read in from a remote URL so they can easily be used to provision bare metal infrastructure.  There were several pieces to this puzzle.

The first was getting down all of the various Ignition configuration components in Terraform.  Nothing was particularly complicated; there was just a lot of trial and error to get everything working.  Terraform has some really nice documentation for working with Ignition configurations, so I’d recommend starting there and just playing around to figure out the various bits and pieces of configuration that Ignition can handle.  There is some documentation on Ignition troubleshooting as well, which I found helpful when things weren’t working correctly.

Below, each portion of the Ignition configuration gets declared inside of an “ignition_config” block.  The Ignition configuration then points at each individual component that we want Ignition to configure, e.g. systemd, filesystems, directories, etc.

data "ignition_config" "staging_rancher_host_stateful" {
  systemd = [
     "${data.ignition_systemd_unit.mount_data.id}",
  ]

  filesystems = [
    "${data.ignition_filesystem.data_fs.id}",
  ]

  directories = [
    "${data.ignition_directory.data_dir.id}",
  ]

  disks = [
    "${data.ignition_disk.data_disk.id}",
  ]
}

This part of the setup is pretty straightforward.  Create a data block with the Ignition configuration needed to mount the disk to the correct location, format the device if it hasn’t already been formatted, create the desired directory, and then create the Systemd unit that configures the mount point for the OS.  Here’s what each of the data blocks might look like.

data "ignition_filesystem" "data_fs" {
   name = "data"

  mount {
    device = "/dev/xvdb1"
    format = "ext4"
  }
}

data "ignition_directory" "data_dir" {
  filesystem = "data"
  path = "/data"
  uid = 500
  gid = 500
}

data "ignition_disk" "data_disk" {
  device = "/dev/xvdb"

  partition {
    number = 1
    start = 0
    size = 0
  }
}

Next, create the Systemd unit.

data "ignition_systemd_unit" "mount_data" {
  content = "${file("./data.mount")}"
  name = "data.mount"
}

Another challenge was getting the Systemd unit to mount the disk correctly.  I don’t work with Systemd frequently, so I initially had some trouble figuring this part out.  Basically, Systemd expects the unit file name to EXACTLY match what’s declared inside the “Where” clause of the unit definition.

For example, the following configuration needs to be named data.mount because /data is what is defined in the Where clause of the unit.

[Unit]
Description=Mount /data
Before=local-fs.target

[Mount]
What=/dev/xvdb1
Where=/data
Type=ext4

[Install]
WantedBy=local-fs.target
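
Once a machine comes up with this configuration, a quick sanity check is to log in and make sure the unit actually ran and the filesystem landed where you expect (the names here match the unit and mount point above):

systemctl status data.mount   # should show the unit as active (mounted)
df -h /data                   # should show /dev/xvdb1 mounted at /data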

After all the kinks have been worked out of the Systemd unit(s) and the Terraform Ignition configuration above, you should be able to deploy this and have Ignition provision disks for you automatically when the OS comes up.  This can be extended as much as needed for getting initial disks set up correctly and is a huge step toward automating your infrastructure in a nice, repeatable way.

There is currently an open issue with Ignition where it breaks when attempting to re-provision a previously configured disk on a new machine.  Basically the Ignition process chokes because it sees that the device has already been partitioned and formatted and can’t do it again.  I ran into this scenario while trying to create a floating persistent data EBS volume that gets attached to servers in an autoscaling group, where I wanted the volume to be able to move around freely if a server gets killed off.


Configure S3 to store load balancer logs using Terraform

If you’ve ever encountered the following error (or similar) when setting up an AWS load balancer to write its logs to an S3 bucket using Terraform, you are not alone.  I decided to write a quick note about this problem because it is the second time I have been bitten by it and had to spend time Googling around for an answer.  The AWS documentation for creating and attaching the policy makes sense, but the idea behind why you need to do it is murky at best.

aws_elb.alb: Failure configuring ELB attributes: InvalidConfigurationRequest: Access Denied for bucket: <my-bucket> Please check S3bucket permission
status code: 409, request id: xxxx

For reference, here are the docs for manually creating the policy by going through the AWS console.  This method works fine for manually creating the policy and attaching it to the bucket.  The problem is that it isn’t obvious why this needs to happen in the first place, and it also isn’t obvious how to do it in Terraform once you figure out why.  Luckily Terraform has great support for IAM, which makes it easy to configure the policy and attach it to the bucket correctly.

Below is an example of how you can create this policy and attach it to your load balancer log bucket.

data "aws_elb_service_account" "main" {}

data "aws_iam_policy_document" "s3_lb_write" {
    policy_id = "s3_lb_write"

    statement = {
        actions = ["s3:PutObject"]
        resources = ["arn:aws:s3:::<my-bucket>/logs/*"]

        principals = {
            identifiers = ["${data.aws_elb_service_account.main.arn}"]
            type = "AWS"
        }
    }
}
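
One thing worth calling out: the data source above only renders the policy JSON, it doesn’t attach anything on its own.  To actually attach it to the bucket you also need an aws_s3_bucket_policy resource, roughly like the following (the resource name s3_lb_write_policy is arbitrary and <my-bucket> is the same placeholder as above):

resource "aws_s3_bucket_policy" "s3_lb_write_policy" {
    bucket = "<my-bucket>"
    policy = "${data.aws_iam_policy_document.s3_lb_write.json}"
}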

Notice that you don’t need to explicitly define the principal account ID like you do when setting up the policy manually.  Just use the ${data.aws_elb_service_account.main.arn} data source and Terraform will figure out the region the load balancer lives in and pick out the correct ELB service account to attach to the policy.  You can verify this by checking the table from the link above and cross referencing it with the Terraform output for creating and attaching the policy.

You shouldn’t need to update anything in the load balancer config for this to work, just rerun the failed command again and it should work.  For completeness here is what that configuration might look like.

...
access_logs {
    bucket = "${var.my_bucket}"
    prefix = "logs"
    enabled = true
}
...

This process is easy enough, but it still begs the question: why does this seemingly unnecessary step need to happen in the first place?  After searching around for a bit I finally found this:

When Amazon S3 receives a request—for example, a bucket or an object operation—it first verifies that the requester has the necessary permissions. Amazon S3 evaluates all the relevant access policies, user policies, and resource-based policies (bucket policy, bucket ACL, object ACL) in deciding whether to authorize the request.

Okay, so it basically looks like when the load balancer gets created, the load balancer gets associated with an AWS owned ID, which we need to explicitly give permission to, through IAM policy:

If the request is for an operation on an object that the bucket owner does not own, in addition to making sure the requester has permissions from the object owner, Amazon S3 must also check the bucket policy to ensure the bucket owner has not set explicit deny on the object

Note

A bucket owner (who pays the bill) can explicitly deny access to objects in the bucket regardless of who owns it. The bucket owner can also delete any object in the bucket.

There we go.  There is a little bit more information in the link above but now it makes more sense.


Bash tricks


Update 2/18/18 – add some handy alt shortcuts

Bash is great.  As I have discovered over the years, Bash contains many different layers, like a good movie or a fine wine.  It is fun to explore and expose these different layers and find uses for them.  As my experience level has increased, I have (slowly) uncovered a number of these features of Bash that make life easier and worked to incorporate them in different ways into my own workflows and use them within my own style.

The great thing about fine arts, Bash included, is that there are so many nuances and for Bash, a huge number of features and uses, which makes the learning process that much more fun.

It does take a lot of time and practice to get used to the syntax and to become effective with these shortcuts.  I use this page as a reference whenever I think of something that sounds useful and could save time in a script or a command.  At first it may take more time to look up how to use these shortcuts, but with practice and drilling they will become second nature and real time savers.

Shell shortcuts

Navigating the Bash shell is easy to do, but it takes time to learn how to do well.  Below are a number of shortcuts that make the navigation process much more efficient.  I use nearly all of these shortcuts daily (except Ctrl + t and Ctrl + xx, which I only recently discovered).  In a similar vein, I wrote a separate post long ago about setting up CLI shortcuts on iTerm that can further augment the capabilities of the CLI.

This is a nice reference with more examples and features

  • Ctrl + a => Return to the start of the command you’re typing
  • Ctrl + e => Go to the end of the command you’re typing
  • Ctrl + u => Cut everything before the cursor to a special clipboard
  • Ctrl + k => Cut everything after the cursor to a special clipboard
  • Ctrl + y => Paste from the special clipboard that Ctrl + u and Ctrl + k save their data to
  • Ctrl + t => Swap the two characters before the cursor (you can actually use this to transport a character from the left to the right, try it!)
  • Ctrl + w => Delete the word / argument left of the cursor
  • Ctrl + l => Clear the screen
  • Ctrl + _ => Undo previous key press
  • Ctrl + xx => Toggle between current position and the start of the line

There are some nice Alt key shortcuts in Linux as well.  You can map the alt key in OSX pretty easily to unlock these shortcuts.

  • Alt + l => Lowercase the word that the cursor is under (if the cursor is in the middle of the word it will lowercase the last half of the word)
  • Alt + u => Capitalize the word that the cursor is under
  • Alt + t => Swap words or arguments that the cursor is under with the previous
  • Alt + . => Paste the last word of the previous command
  • Alt + b => Move backward one word
  • Alt + f => Move forward one word
  • Alt + r => Undo any changes that have been done to the current command

Argument tricks

Argument tricks build on the navigation capabilities that the Bash shortcuts provide and can speed up your effectiveness in the terminal even further.  Below is a list of special history expansions that can be passed to any command; a short example session follows the lists.

Repeating

  • !! => Repeat the previous (full) command
  • !foo => Repeat the most recent command that starts with ‘foo‘ (e.g. !ls)
  • !^ => Repeat the first argument of the previous command
  • !$ => Repeat the last argument of the previous command
  • !* => Repeat all arguments of last command
  • !:<number> => Repeat a specifically positioned argument
  • !:1-2 => Repeat a range of arguments

Printing

  • !$:p => Print out the word that !$ would substitute
  • !*:p => Print out what !* would substitute (everything from the previous command except the command itself)
  • !foo:p => Print out the command that !foo would run
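
Here is a short example session showing a few of these expansions in action (the paths are just made up for the demo):

$ mkdir -p /tmp/some/deep/dir
$ cd !$                 # expands to: cd /tmp/some/deep/dir
$ ls /etc/hosts /etc/resolv.conf
$ echo !*               # expands to: echo /etc/hosts /etc/resolv.conf
$ !ls:p                 # prints "ls /etc/hosts /etc/resolv.conf" without running it again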

Special parameters

When writing scripts, there are a number of special parameters the shell provides.  These can be convenient for doing lots of different things in scripts.  Part of the fun of writing scripts and automating things is discovering creative ways to fit the various pieces of the puzzle together in elegant ways.  The “special” parameters listed below are some of those pieces, and can be very powerful building blocks in your scripts; a small example script follows the list.

Here is a full reference from the Bash documentation

  • $* => Expand the positional parameters.  When quoted, “$*” expands to a single word with the parameters joined by the first IFS character – think spaces
  • $@ => Expand the positional parameters.  When quoted, “$@” expands each parameter to a separate word – think arrays
  • $# => Expand the number of parameters of a command
  • $? => Expand the exit status of the previous command
  • $$ => Expand the pid of the shell
  • $! => Expand the pid of the most recent background command
  • $0 => Expand the name of the shell or script
  • $_ => Expand the last argument of the previous command
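
To see a few of these in action, here is a tiny throwaway script (call it whatever you like, params.sh is just a made-up name) that you can run with a couple of arguments:

#!/usr/bin/env bash
# Quick demo of a few special parameters

echo "script name:     $0"
echo "number of args:  $#"

for arg in "$@"; do
    echo "arg: $arg"
done

sleep 1 &
echo "pid of background sleep: $!"

false
echo "exit status of previous command: $?"

Running ./params.sh foo bar prints the script name, 2, each argument on its own line, the sleep PID, and finally 1 for the failed false command.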

Conclusion

There are so many crevices and cracks of Bash to explore; I keep finding new and interesting things about Bash that lead down new paths and help my skills grow.  I hope some of these tricks give you some ideas that can help improve your own Bash style and workflows in the future.


Writing For Tech

As my career has progressed, I have discovered writing to be an invaluable skill to develop and polish as an engineer.  The skill of writing well translates to a number of areas inside and outside of tech, including things like writing good emails, networking and chatting over real time collaboration tools (IRC, Slack, etc.), writing documentation, writing specs, and even just asking for help in online formats like message boards or social media sites.

For example, when asking for help in a public technical forum, e.g. GitHub issues or Stack Overflow, knowing exactly what problem you are having and describing it in a way that makes sense to others (who often don’t speak English as a first language) is much more difficult than it looks.  It takes time and practice to learn how to craft questions well and to frame technical problems in easy to understand ways.  In my own experience people are almost always happy to help, but I’ve seen so many badly asked questions on Stack Overflow.

There are two books I recently read that have had a tremendous impact on how I think about and approach writing, and that have helped me grow as a writer, engineer, and technical collaborator; I’d like to share them with readers today.  These books have been around for a long time, so if you’ve already heard of them or it has been a while since you read them, I encourage you to reread or at least skim through them again.

The first book, On Writing Well: The Classic Guide to Writing Nonfiction, is a great book that really forced me to think about my writing and what I could be doing better.  Instead of focusing on the mechanics and building blocks of writing (it does touch on these a little bit), On Writing Well focuses mainly on style and on how to make your writing better by making it more interesting and less wasteful.  The book teaches readers that often, less is more in writing, and it focuses on lessons of simplicity as well as brevity, boiling things down to their simplest forms and avoiding certain traps and pitfalls.

The second book is called The Elements of Style, Fourth Edition.  There are some really good tricks and tidbits in this book that 100% improved my writing fundamentals and mechanics, even without much practice outside of reading the book.  I would highly recommend it to anybody that is interested in improving the foundations of their writing, from improving vocabulary to improving the structure and overall quality of their writing.  The book is fairly short, so it doesn’t take long to work through, and it is a great tool for improvement; you will more than likely find some tips that are immediately useful in your own writing style.

Getting better at writing is a process, just like learning any other skill.  The more time you spend thinking about it and practicing, the better you will get.  Obviously in my own personal experience, having this blog has been a great way for me to learn and grow my writing skills.  Not every blog post is a success in my eyes but I have learned lessons from doing things over and over again and discovered things that work or don’t work.

One lesson from On Writing Well that has really stuck with me is the idea that your writing should be written for yourself.  Instead of thinking about things that other people want, or what you think they want, just write about things that are interesting or that have personal meaning and the writing process will be much more rewarding.  Applying this concept makes the process of writing much more enjoyable and keeps the gears turning.

Another idea from the book that stuck with me is that everybody has their own style of writing and none of them are bad.  So if you feel pressure to write or create a certain way, don’t.  The writing process that works best for you is fine; you just need to find it if you don’t know what it is already.  One of the most important lessons in writing that I have discovered over the years is that I’m not really interested in writing my blog posts according to any set of formulas or criteria.

In my own writing process, I usually like to find a problem that is interesting or challenging to me, sit down, and just start writing.  This process helps me internalize and better understand the problem I am attempting to solve, as well as gives me a platform to help others.  I attribute my own process and writing style to a lot of practice and to using the lessons I have learned to build up a style that works for me.

Good luck and happy writing.
