
Shipping logs to ELK

Following along in the progression of this little mini series about getting the ELK stack working on Docker, we are almost finished.  The last step, after getting the ELK stack up and running (part 1) and optimizing LS and ES (part 2), is to get logs flowing into the ELK server.

There are a few options (actually there are a lot) for getting your logs into Logstash and Elasticsearch.  I will be focusing on the two log shippers I found to be the most powerful and flexible for this task.  There are a variety of other options for jamming logs into LS, but for my purposes they either didn't fit in with my workflow or just weren't supported well enough.

For more info you can check out the various other inputs here.

Another notable project not covered here is the Logstash agent, which is just the logging agent component of LS but requires installing the entire LS project.  It is a heavyweight solution, but it is good for testing locally.

There is also the beaver project for logging over a TCP socket, which is nice if you are either logging internally only or using a broker like Redis or Kafka.  It is obviously not a great option if security of log transmission is important to you, so I would avoid it if you are collecting logs over a public internet connection.

logstash-forwarder

The first log shipper I started with, creatively titled "logstash-forwarder", was created by the author of Logstash and is written in Go, so it is super fast and has a very small footprint.  Another benefit of this logging method is that connections to the LS server are wrapped in TLS, so it solves the problem that straight TCP collectors have by securing the data in transit.

There are great instructions for getting up and running on the project's GitHub page, including instructions for creating a Debian/RPM package out of the Go binary for an easy way to distribute the shipper.  If you plan on shipping the logs via a Docker container, I would suggest looking through the docs on the GitHub page for how to build the Debian package.
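
As a rough sketch of what that build looks like (the fpm invocation and install path here are my own assumptions for illustration, defer to the README for the real steps):

# clone and compile the Go binary
git clone https://github.com/elastic/logstash-forwarder.git
cd logstash-forwarder
go build

# package the binary as a .deb with fpm (gem install fpm), install path is illustrative
fpm -s dir -t deb -n logstash-forwarder -v 0.4.0 \
  ./logstash-forwarder=/opt/logstash-forwarder/bin/logstash-forwarder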

The recently released version 0.4.0 was an attractive option because it added the ability to tail logs, so that LSF won't try to forward an entire log file if the "pipe" to the LS server gets broken or the agent dies and needs to be restarted.  Prior to the 0.4.0 release these issues could bog down or crash the LS server, record logs out of order or create duplicates.

To run logstash-forwarder with the appropriate tailing flag turned on, use this command:

/opt/logstash-forwarder/bin/logstash-forwarder -tail -config /etc/logstash-forwarder

A couple of things to note.  The /opt/logstash-forwarder/bin/logstash-forwarder part is the path the binary was installed to.  The -tail flag tells LSF to tail the log.  The -config flag specifies where the LSF client should look for a configuration to load.

The configuration can be as simple (or complicated) as you want.  It basically just needs a cert to communicate with the Logstash server.

{
  "network": {
    "servers": [ "<server>:<port>" ],
    "ssl certificate": "/opt/certs/logstash-forwarder.crt",
    "ssl key": "/opt/certs/logstash-forwarder.key",
    "ssl ca": "/opt/certs/logstash-forwarder.crt",
    "timeout": 15
  },

  "files": [
    {
      "paths": [ "/var/log/*.log" ],
      "fields": { "type": "syslog" }
    }
  ]
}
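
If you don't already have a cert, a self-signed pair can be generated with openssl.  This is a minimal sketch; the key size and expiry are my own assumptions, and the CN needs to match the <server> name the client connects to:

openssl req -x509 -batch -nodes -newkey rsa:2048 \
  -keyout logstash-forwarder.key -out logstash-forwarder.crt \
  -days 365 -subj "/CN=<server>"

The files then get copied into /opt/certs/ by the Dockerfile below.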

By default, the LSF client can be somewhat noisy in its stdout logging (especially for a Docker container), so we can turn down the info logging so that only errors and alerts are logged:

/opt/logstash-forwarder/bin/logstash-forwarder -quiet -tail -config /etc/logstash-forwarder

There are more options, of course; you can list them by running the binary with no additional options passed in.  But for my use case, -quiet and -tail were all I needed.

Since the theme of this mini series is how to get everything running in Docker, I will show you what a logstash-forwarder Docker image looks like here.  The Dockerfile for creating the logstash-forwarder image is pretty straightforward.  I have chosen to install a few extra tools into the container that help with troubleshooting, should there ever be an issue with the client running inside the container.

We also inject the deb package into the container, as well as the certs.

FROM debian:wheezy

ENV DEBIAN_FRONTEND noninteractive

# Install
RUN apt-get update && apt-get install -y -qq vim curl netcat
ADD logstash-forwarder_0.4.0_amd64.deb /tmp/
RUN dpkg -i /tmp/logstash-forwarder_0.4.0_amd64.deb

# Config
RUN mkdir -p /opt/certs/
ADD local.conf /etc/logstash-forwarder
ADD logstash-forwarder.crt /opt/certs/logstash-forwarder.crt
ADD logstash-forwarder.key /opt/certs/logstash-forwarder.key

# start lsf
CMD ["/opt/logstash-forwarder/bin/logstash-forwarder", "-quiet", "-tail", "-config", "/etc/logstash-forwarder"]

I believe there are future plans to create a logger similar to LSF but written in JRuby, so it is easier to maintain and fits better with the style of the LS project.

The last piece to get this working is the docker run command.  It will depend on your own environment, but a generic run command might look like the following.  Obviously replace "<myserver>" and "<org/image:tag>" with your specific information.

docker run -v /data:/data --name lsf --hostname <myserver> <org/image:tag>

Log Courier

I was having issues getting logstash-forwarder to work correctly at one point, so I began to explore different options for loggers and stumbled across this awesome project.  Log Courier is like logstash-forwarder on steroids.  It is much more customizable and offers a large number of options that aren't available in logstash-forwarder, such as the ability to do log processing at the client end, which is a major bonus over other log shippers.

The project (and its documentation) lives in this GitHub project.  The docs are very good and the maintainer is very responsive to issues and questions, so I recommend checking out the project as a reference.  Log Courier is similar to LSF in that you need to build it and create a package for it, so as a prerequisite you will need to have Go installed.

Again, all of this information is on the GitHub project, which does a much better job of explaining how to get everything working.  To help alleviate some of the build issues that turn people away from this project, I believe there are discussions now about creating publicly available Debian and RPM packages.
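
For a rough outline of the build, it looks something like the following (the exact steps and build targets vary between versions, so treat this as a sketch and defer to the project docs):

# clone and build (requires Go)
git clone https://github.com/driskell/log-courier.git
cd log-courier
make

# then package the resulting binaries with fpm, the same way as the LSF package above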

Once you have your package created and installed you can run LC as follows:

/opt/log-courier/bin/log-courier -config /etc/courier.conf

The only flag we need to pass is the -config flag.  There are a few other command line flags available, but almost all of the configuration for LC is done via the config file that gets passed to the client when it starts, including logging levels and other customizations.  It isn't really mentioned here, but the default behavior for LC is to tail the logs, so you don't need to worry about crashing your LS server if the stream ever breaks.  LC is good at figuring out what it should do and picking up where it left off.

You can check the docs for all of the custom configuration options you can pass to LC here.

Let's take a look at what a sample configuration file might look like in LC to demonstrate some of its enhanced features.

{
  "network": {
    "servers": [ "<server>:<port>" ],
    "ssl ca": "/opt/certs/courier.crt",
    "timeout": 15
  },

  "general": {
    "log level": "debug"
  },

  "files": [
    {
      "paths": [ "/data/*foo.log" ],
      "fields": { "type": "foo" }
    },
    {
      "paths": [ "/data/*bar.log" ],
      "fields": { "type": "bar" },
      "codec": {
        "name": "multiline",
        "pattern": "^%{TIMESTAMP_ISO8601} ",
        "negate": true,
        "what": "previous"
      }
    }
  ]
}

The network section is similar to LSF: you need to point the client at the correct server and also tell it which cert to connect with.  Generating the cert is basically the same as it was for LSF, just use a different name.  The "general" section provides a place to set options at the global level for LC.  This configuration also uses glob expansion to match log paths, the same way LSF does.  The most interesting part is that in this configuration we can do multiline logging at the client level, which LSF does not support.  This takes some of the processing strain off of the server and is a great reason to use LC.
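
To illustrate the multiline piece, a stack trace like this hypothetical one would be shipped as a single event, because the continuation lines don't start with an ISO8601 timestamp and therefore get appended to the previous line:

2015-06-01 12:00:00,123 ERROR Something blew up
java.lang.RuntimeException: boom
    at com.example.Foo.bar(Foo.java:42)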

And because this is another Docker example, here is the Dockerfile.  It is very similar to the LSF Dockerfile; we are just using a different .deb file (which we created above), different certs and a different CMD to start the logger.

FROM debian:wheezy

ENV DEBIAN_FRONTEND noninteractive

# Install
RUN apt-get update && apt-get install -y -qq vim curl netcat
ADD log-courier_1.6_amd64.deb /tmp/
RUN dpkg -i /tmp/log-courier_1.6_amd64.deb

# Config
RUN mkdir -p /opt/certs/
ADD local.conf /etc/courier.conf
ADD courier.crt /opt/certs/courier.crt
ADD courier.key /opt/certs/courier.key

# start log courier
CMD ["/opt/log-courier/bin/log-courier", "-config", "/etc/courier.conf"]

As mentioned, I have already built the Debian package, so I simply inject it into my Docker image.  Running the Docker image is similar to LSF.

docker run -v /data:/data --name courier --hostname <myserver> <org/image:tag>

Conclusion

Some of the configurations I am using are specific to my workflow and environment, but most of this can be adapted.  Running the LSF or LC clients in containers is a great way to isolate your logging client.  The reason this works so well in my scenario is that we use the /data volume as a pattern on all of our host machines for application specific logs, which makes it very easy to point the LSF and LC clients at the right location.  If you are using different directories (or lots of them), just update the volume mounts in your docker run command to look in the locations where you expect logs.

Once you have the logging workflow mastered, you can start writing unit files to run these containers via systemd or fleet, or injecting them into cloud-configs, which makes scaling these logging containers simple.  Our environment leverages CoreOS, so we write unit files for the loggers in our cloud-configs, which takes care of scaling this workflow.  If you aren't using CoreOS or systemd, this could probably be made to work with docker-compose, but I haven't tried it yet.
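
For reference, a minimal unit file for the Log Courier container might look something like this (the names and paths are illustrative, adapt them to your environment):

[Unit]
Description=Log Courier shipping container
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStartPre=-/usr/bin/docker rm -f courier
ExecStart=/usr/bin/docker run --rm -v /data:/data --name courier --hostname %H <org/image:tag>
ExecStop=/usr/bin/docker stop courier

[Install]
WantedBy=multi-user.target

Drop it in /etc/systemd/system/courier.service, then systemctl enable courier and systemctl start courier.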

If you don’t use Docker then you can easily strip out the LSF and LC specific parts to get this working.  The main issue to work through will be creating the package for distribution and installation.  Once you have the packages you should be good to go, all of the commands and configuration being run by Docker should work the same.

Feel free to comment or let me know if you have questions.  There are a lot of moving pieces to this workflow, but it becomes pretty powerful once all of the components are set up and in place.

Josh Reichardt

Josh is the creator of this blog, a system administrator and a contributor to other technology communities such as /r/sysadmin and Ops School. You can also find him on Twitter and Facebook.