I wrote a post a while back about how to get the ElasticSearch + Logstash + Kibana stack set up, and since I have recently been very involved with Docker, I thought it would be appropriate to update that post with the new Docker way of doing things.
Update (5/9/15) – I have created a github repo containing configs for running this. Reader Sergio has also posted a similar solution on github, you can check it out here if you want to try it out.
I found a surprising lack of posts describing how to run the ELK stack with Docker and docker-compose. This post is much longer and more detailed than usual, so feel free to jump around to different sections for details on different components. I am planning a follow-up to this post with instructions on how to configure Logstash and the logstash-forwarder client, as well as Kibana, to do interesting things with logs stored in ElasticSearch.
There are a lot of other posts about how to get the stack working, but they are either already out of date, since the Docker world changes so fast, or they don't cover the specific details of how the different bits work. The other thing I have observed is that most of the other guides don't use Docker, which is something that makes life much easier.
So the first thing that I'll cover is how to build the Docker images. If you are interested I can make this stuff available on the Docker hub as images or public Dockerfiles. However, for this article and in general I strongly prefer to write my own Dockerfiles, so I will be posting my custom configs and files here rather than pulling other (sometimes official) prebuilt images.
Logstash
The first component we will set up is the Logstash server. This setup also uses the log-courier input plugin; log-courier is a more customizable and flexible client for forwarding logs to Logstash. I use both logstash-forwarder and log-courier in this configuration to allow for a more flexible setup.
The following is a Dockerfile that will build a Logstash 1.5.0 image. One thing to note about this approach is that you can swap out LOGSTASH_VER and the correct version will be pulled in automatically the next time the image is rebuilt.
FROM ubuntu:14.04

ENV DEBIAN_FRONTEND noninteractive
ENV LOGSTASH_VER 1.5.0.rc2
WORKDIR /opt

# Dependencies
RUN apt-get update -qq && \
    apt-get install -y -qq \
    wget \
    python \
    openjdk-7-jre-headless

# Install Logstash
RUN wget --quiet "https://download.elasticsearch.org/logstash/logstash/logstash-$LOGSTASH_VER.tar.gz" -O "/opt/logstash-$LOGSTASH_VER.tar.gz" --no-check-certificate && \
    tar zxf logstash-$LOGSTASH_VER.tar.gz && \
    mv logstash-$LOGSTASH_VER logstash

# Install plugins
RUN /opt/logstash/bin/plugin update logstash-output-zeromq
RUN /opt/logstash/bin/plugin install logstash-input-log-courier

# Config files
ADD server.conf /etc/logstash/server.conf
ADD logstash-forwarder.key /etc/logstash/logstash-forwarder.key
ADD logstash-forwarder.crt /etc/logstash/logstash-forwarder.crt

# lumberjack port
EXPOSE 4545
# log-courier port
EXPOSE 4546

CMD /opt/logstash/bin/logstash -f /etc/logstash/server.conf
This will install version 1.5.0.rc2 along with the logstash-input-log-courier input plugin, add certificates for the forwarding clients, and start Logstash with the server configuration.
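docker-compose will handle this build for us later on, but if you want to build and iterate on the image by hand, it is a single command from the directory containing the Dockerfile and config files (the image name and tag here are just examples):

docker build -t elk/logstash:1.5.0.rc2 .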
In addition to this Dockerfile you will need to generate some certificates for the logstash-forwarder clients and the Logstash server itself to use, as well as the server configuration used by the Logstash server. Below is a sample, extremely barebones server.conf configuration file.
input {
  lumberjack {
    port => 4545
    ssl_certificate => "/etc/logstash/logstash-forwarder.crt"
    ssl_key => "/etc/logstash/logstash-forwarder.key"
    codec => plain {
      charset => "ISO-8859-1"
    }
  }

  courier {
    port => 4546
    ssl_certificate => "/etc/logstash/logstash-forwarder.crt"
    ssl_key => "/etc/logstash/logstash-forwarder.key"
  }
}

output {
  elasticsearch {
    cluster => "elasticsearch"
  }
}
I will flesh this config out in the next post on customizing Logstash and doing interesting things with Kibana. For now, we are defining a lumberjack and a courier input, used to ingest logs into Logstash, as well as one output, which tells Logstash what to do with the logs; in this example it just stuffs them into ES.
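For reference, here is a minimal sketch of what a logstash-forwarder client config pointing at the lumberjack input above might look like. The hostname and log path are placeholders, and I'll cover the client side properly in the follow-up post:

{
  "network": {
    "servers": [ "logstash.domain.com:4545" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 15
  },
  "files": [
    {
      "paths": [ "/var/log/syslog" ],
      "fields": { "type": "syslog" }
    }
  ]
}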
To generate the certificates needed for logstash/logstash-forwarder you can either follow the instructions on the logstash-forwarder github page or use the following command to generate the certs and subsequently inject them into the Docker image. It should probably go without saying, but make sure the version of openssl used to generate these is an up to date and secure version.
openssl req -x509 -nodes -sha256 -days 1095 -newkey rsa:2048 -keyout logstash-forwarder.key -out logstash-forwarder.crt
You will need to follow a few prompts to fill out the certificate details, again you can reference the logstash-forwarder github page if you get stuck or are unsure of how to configure the certificate.
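If you'd rather skip the prompts entirely, openssl can take the certificate subject on the command line. A sketch, assuming the server's hostname is logstash.domain.com (use whatever name your forwarding clients will actually connect to):

# Non-interactive variant; the CN is a placeholder and should match your Logstash host
openssl req -x509 -nodes -sha256 -days 1095 -newkey rsa:2048 \
  -keyout logstash-forwarder.key -out logstash-forwarder.crt \
  -subj "/CN=logstash.domain.com"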
After the certs are generated, make sure that the output file names match the names in the Dockerfile above, and that is pretty much it for getting Logstash ready.
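As a quick sanity check, you can ask Logstash to validate the baked-in config before wiring everything together. A sketch, assuming the image tag from the build example earlier:

docker run --rm elk/logstash:1.5.0.rc2 /opt/logstash/bin/logstash --configtest -f /etc/logstash/server.conf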
ElasticSearch
The ElasticSearch image is also pretty straightforward. This will build the version specified in the ES_PKG_NAME variable, currently 1.4.4, and will configure a few things.
# Pull base image.
FROM dockerfile/java:oracle-java7

# Set install version
ENV ES_PKG_NAME elasticsearch-1.4.4

# Install ElasticSearch
RUN \
  cd / && \
  wget https://download.elasticsearch.org/elasticsearch/elasticsearch/$ES_PKG_NAME.tar.gz && \
  tar xvzf $ES_PKG_NAME.tar.gz && \
  rm -f $ES_PKG_NAME.tar.gz && \
  mv /$ES_PKG_NAME /elasticsearch

# Define mountable directories
VOLUME ["/data"]

# Define working directory
WORKDIR /data

# Custom ES config
ADD elasticsearch.yml /elasticsearch/config/elasticsearch.yml

# Define default command
CMD ["/elasticsearch/bin/elasticsearch"]

# Expose ports
EXPOSE 9200
EXPOSE 9300
The main key to getting ES to work is getting the configuration file set up correctly. In this example we are mounting local storage (/data) from the host OS into the container so that the data and indexes aren't wiped out if the container dies. There are also a few other security-related configuration settings that get set here to lock things down and to make Kibana 4 happy.
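docker-compose will take care of running the container later, but a minimal sketch of running the image on its own looks like this (the image tag is just an example, and the /data directory needs to exist on the host):

# Create the host directory backing the container's /data volume
sudo mkdir -p /data
docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -v /data:/data elk/elasticsearch:1.4.4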
http.cors.allow-origin: "/.*/"
http.cors.enabled: true
cluster.name: elasticsearch
node.name: "logstash.domain.com"
path:
  data: /data/index
  logs: /data/log
  plugins: /data/plugins
  work: /data/work
ES is very straightforward: once the config is in place, you just start it up and it runs.
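Once the container is up, you can verify that ES is alive and the cluster is healthy with a couple of quick curl checks from the Docker host (port 9200 is published above):

curl http://localhost:9200
curl "http://localhost:9200/_cluster/health?pretty"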
Kibana
This will build the newest iteration of Kibana, which is 4.0.0 as of this writing. If you aren’t living on the bleeding edge and want to know how to get Kibana 3.x.x working let me know and I will post the configuration for it.
FROM ubuntu:14.04

# Dependencies (no sudo needed; Docker builds run as root)
RUN apt-get update -qq && \
    apt-get install -y -qq nginx-full wget vim

# Kibana
RUN mkdir -p /opt/kibana
RUN wget https://download.elasticsearch.org/kibana/kibana/kibana-4.0.0-linux-x64.tar.gz -O /tmp/kibana.tar.gz && \
    tar zxf /tmp/kibana.tar.gz && \
    mv kibana-4.0.0-linux-x64/* /opt/kibana/

# Configs
ADD kibana.yml /opt/kibana/config/kibana.yml

EXPOSE 5601

CMD /opt/kibana/bin/kibana
So the Dockerfile is pretty straightforward, but there are a few tidbits to be aware of. Kibana 4.x.x is significantly different in how it works than 3.x.x, so you will need to make a few adjustments if you are familiar with the old version.
You will need to pick and choose the bits out of the following configuration to suit your needs. For example, you will need to adjust the elasticsearch_url, username, password and will need to decide whether to turn ssl on or off. There are obviously more options but most of them probably don’t need to be adjusted for now. Here is what the sample config looks like.
port: 5601
host: "0.0.0.0"
elasticsearch_url: "http://logstash.domain.com:9200"
elasticsearch_preserve_host: true
kibana_index: ".kibana"
default_app_id: "discover"
request_timeout: 300000
shard_timeout: 0
verify_ssl: false

# Plugins that are included in the build, and no longer found in the plugins/ folder
bundled_plugin_ids:
  - plugins/dashboard/index
  - plugins/discover/index
  - plugins/doc/index
  - plugins/kibana/index
  - plugins/markdown_vis/index
  - plugins/metric_vis/index
  - plugins/settings/index
  - plugins/table_vis/index
  - plugins/vis_types/index
  - plugins/visualize/index
That's pretty much it. Most of the difficulty of getting the new version of Kibana working is in the configuration, so if you want to tweak things or if something isn't working, that is the first place to look.
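A quick way to confirm Kibana came up at all is to hit its port from the Docker host (5601 is the port exposed above):

curl -I http://localhost:5601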
Docker Compose (glue the pieces together)
This is an integral part of the setup; it is what controls the different containers and glues everything together. Luckily it is easy to get set up and working. If you aren't familiar with it, docker-compose is the recently rebranded successor to the old "fig" tool, a Docker orchestration tool for running complex Docker applications easily.
The official docs are pretty good and detailed so you can visit their site if you have any questions about how to install or how to get any of the components working here.
The first step is to download and install docker-compose. Here I am using an Ubuntu system.
sudo pip install -U docker-compose
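If pip isn't already on the host, it can be installed from the Ubuntu repositories first:

sudo apt-get update && sudo apt-get install -y python-pip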
There are a few docker-compose command line commands to be familiar with which we’ll get to next, but first I will post the sample docker-compose configuration file to test out your ELK stack.
kibana:
  build: /home/<user>/elk/kibana/4.0.0
  restart: always
  ports:
    - "5601:5601"
  links:
    - "elasticsearch:elasticsearch"

elasticsearch:
  build: /home/<user>/elk/elasticsearch/1.4.4
  restart: always
  ports:
    - "9200:9200"
    - "9300:9300"
  volumes:
    - "/data:/data"

logstash:
  build: /home/<user>/elk/logstash/logstash-1.5.0
  restart: always
  ports:
    - "4545:4545"
    - "4546:4546"
Most of the configuration is straightforward. Here are the main commands to get everything stitched together and working.
- docker-compose build (from the directory the docker-compose.yml file is in)
- docker-compose up (to test the stack)
- docker-compose kill (bring it down)
After you have ironed out all the build and startup issues and the stack comes up with no errors, you can start the stack in detached mode.
- docker-compose up -d
Additionally, you can look at the logs if something smells fishy.
- docker-compose logs
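You can also check the state of the individual containers, or tail the logs for just one service (the service names come from the docker-compose.yml above):

docker-compose ps
docker-compose logs logstash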
Design considerations
One thing that readers might wonder about is the scalability of this setup. It will scale up very easily, but not out. That said, this should be able to handle up to 100k events/second on the Logstash end, so other bottlenecks will appear before the components (ES and Logstash) fall down. I haven't pushed my own setup that far yet; I have been able to get to around 30k events/sec before Logstash dies, which I'm still investigating. Even with that amount of activity and Logstash choking, ES and Kibana are unaffected.
So if you use this as a guide for a production setup, I would recommend a decently sized server, at least 4 CPUs and 8 GB of memory, and adjusting the memory and cpu options for the Logstash component if you plan on throwing a lot of logs at it (>30k/s). I will revisit with some best practices for making Logstash run more smoothly once I have worked out the performance issues.
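As one example of such an adjustment, the Logstash startup scripts honor the LS_HEAP_SIZE environment variable, which can be set per-service in docker-compose.yml. A sketch only; the 4g value is an illustration, not a tuned recommendation:

logstash:
  build: /home/<user>/elk/logstash/logstash-1.5.0
  restart: always
  environment:
    # JVM heap for Logstash; size this to your event volume
    - LS_HEAP_SIZE=4g
  ports:
    - "4545:4545"
    - "4546:4546"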
I would be interested to hear any recommendations for scaling this out, but this setup should scale up decently for most scenarios. I have not played around with ES sharding across hosts, but I imagine it wouldn't be overly complicated, especially using container volume mounts to store the data and indexes at the host OS level.