Set up Drone on arm64 Kubernetes clusters

Continuing with the multiarch and Kubernetes narratives that I have been writing about for awhile now, in this post I will be showing off some of the capabilities of the Drone.io continuous integration tool running on my arm64 Kubernetes homelab cluster.

As arm64 continues to gain popularity, more and more projects are adding support for it, including Drone.io. With its semi recent announcement, Drone can now run on a variety of different architectures now. Likewise, the Drone maintainers have also been working towards a 1.0 release, which brings first class support for Kubernetes, among other things.

I won’t touch on too many of the specifics of Drone, but if you’re interested in learning more, you can check out the website. I will mostly be focusing on how to get things running in Kubernetes, which turns out to be exactly the same for both amd64 and arm64 architectures. There were a few things I discovered along the way to get the Kubernetes integrations working but for the most part things “just work”.

I started off by grabbing the Helm chart and modifying it to suit my needs. I find it easiest to template the chart and then save it off locally so I can play around with it.

Note: the below example assumes you already have helm installed locally.

git clone [email protected]:helm/charts.git && cd charts
helm template --name drone --namespace cicd \
   --set 'sourceControl.provider=github' \
   --set 'sourceControl.github.clientID=XXXXXXXX' \
   --set 'sourceControl.secret=drone-server-secrets' \
   --set 'server.host=drone.example.com' \
   --set 'server.kubernetes.enabled=false' \
   stable/drone > /tmp/manifest.yaml

Obviously you will want to set the configurations to match your own settings, like domain name and oauth settings (I used Github).

After saving out the manifest, the first issue I ran into is that port 9000 is still referenced in the Helm chart, which was used for communication between the client and server in the older releases, but is no longer used. So I just completely removed the references to the port in my Frankenstein configuration. If you are just using the Kubernetes configuration mentioned below, you won’t run into these problems connecting the server and agent, but if you use the agent you will.

There is some server config that will need to adjusted as well to get things working. For example, the oauth settings will need to be created on the Github side first in order for any of this to work. Also, the drone server host will need to be accessible from the internet so any firewall rules will need to be added or adjusted to allow traffic.

 env:
  # Webhook setings
  - name: DRONE_ALWAYS_AUTH
    value: "false"
  - name: DRONE_SERVER_HOST
    value: "drone.example.com"
  - name: DRONE_SERVER_PROTO
    #value: http
    value: https
  # Agent config
  - name: DRONE_RPC_SECRET
    valueFrom:
      secretKeyRef:
        name: drone
        key: secret
  # Server config
  - name: DRONE_DATABASE_DATASOURCE
    value: "/var/lib/drone/drone.sqlite"
  - name: DRONE_DATABASE_DRIVER
    value: "sqlite3"
  - name: DRONE_LOGS_DEBUG
    value: "true"
  - name: DRONE_LOGS_PRETTY
    value: "true"
  - name: DRONE_USER_CREATE
    value: "username:<github_user>,machine:false,admin:true,token:abc123"
  # Github config
  - name: DRONE_GITHUB_CLIENT_ID
    value: abcd
  - name: DRONE_GITHUB_SERVER
    value: https://github.com
  - name: DRONE_GITHUB_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: client-secret-drone
        key: secret

Add the DRONE_USER_CREATE env var to bootstrap an admin account when starting the Drone server. This will allow your user to do all of the admin things using the CLI tool.

The secrets so should get generated when you dump the Helm chart, so feel free to update those with any values you may need.

Note: if you have double checked all of your settings but builds aren’r being triggered, there is a good chance that the webhook is the problem. There is a really good post about how to troubleshoot these settings here.

Running Drone with the Kubernetes integration

This option turned out to be the easier approach. Just add the following configuration to the drone server deployment environment variables, updating according to your environment. For example, the namespace I am deploying to is called “cicd”, which will need to be updated if you choose a different namespace.

- name: DRONE_KUBERNETES_ENABLED
  value: "true"
- name: DRONE_KUBERNETES_NAMESPACE
  value: cicd
- name: DRONE_KUBERNETES_SERVICE_ACCOUNT
  value: drone-pipeline

The main downside to this method is that it creates Kubernetes jobs for each build. By default, once these builds are done, the will exit and not clean themselves up, so if you do a lot of builds then your namespace will get clogged up. There is a way to set TTLs on finished to clean themselves up in newer versions of Kubernetes via the TTLAfterFinished flag but this functionality isn’t default in Kubernetes yet and is a little bit out of the scope of this post.

Running Drone with the agent configuration

The agent uses the sidecar pattern to run a Docker in Docker (dind) container to connect to the Docker socket in order to allow the Drone agent to do its builds.

The main downside of using this approach is that there seems to be an issue (sometimes) where the Drone components can’t talk to the Docker socket, you can find a better description of this problem and more details here. The problem seems to be a race condition where the docker socket is not being able to be mounted before the agent comes up, but I still haven’t totally solved the problem there yet. The advice for getting around this is to run the agent on a dedicated stand alone host to avoid race conditions and other pitfalls.

That being said, if you still want to use this method you will need to add an additional deployment to the config for the drone agent. If you use the agent you can disregard the above Kubernetes environment variable configurations and instead set appropriate environment variables in the agent. Below is the working snipped I used for deploying the agent to my test cluster.

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: drone-agent
  labels:
    app: drone
    component: agent
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: drone
        component: agent
    spec:
      serviceAccountName: drone
      containers:
      - name: agent
        image: "docker.io/drone/agent:1.0.0-rc.6"
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 3000
          protocol: TCP
        env:
          # This value should point to the Drone server service
          - name: DRONE_RPC_SERVER
            value: http://drone.cicd
          - name: DRONE_RPC_SECRET
            valueFrom:
              secretKeyRef:
                name: drone
                key: secret
          - name: DOCKER_HOST
            value: tcp://localhost:2375
          - name: DRONE_LOGS_DEBUG
            value: "true"
          # Uncomment this for additional trace logs
          #- name: DRONE_LOGS_TRACE
          #  value: "true"
          - name: DRONE_LOGS_PRETTY
            value: "true"

      - name: dind
        image: "docker.io/library/docker:18.06-dind"
        imagePullPolicy: IfNotPresent
        env:
        - name: DOCKER_DRIVER
          value: overlay2

        securityContext:
          privileged: true

        volumeMounts:
          - name: docker-graph-storage
            mountPath: /var/lib/docker
      volumes:
      - name: docker-graph-storage
        emptyDir: {}

I have gotten the agent to work, I just haven’t had very much success getting it working consistently. I would avoid using this method unless you have to or as mentioned above, get a dedicated host for running builds on.

Testing it out

After wiring everything up, the rest is easy. Add a file called .drone.yml to a repository that you would like to automate builds for. You can find out more about the various capabilities here.

For my use case I wanted to tell drone to build and publish an arm64 based Docker image whenever a change to master occurs. You can look at my drone configuration here to get a better idea of the multiarch options as well as authenticating to a Docker registry.

After adding the .drone.yml to your repo and triggering a build you should see something similar in your local Drone instance.

Sample Drone build

If everything worked correctly and is green then you should be good to go. Obviously there is a lot of overhead that Kubernetes brings but the actual Drone setup if really straight forward. Just set stuff up on the Github side, translate it into Kubernetes configurations and add some other Drone specific config options and you should have a nice CI/CD environment ready to go.

Conclusion

I am very excited to see Drone grow and mature and use it more in the future. It is simple yet flexible and it fits nicely into the new paradigm of using containers for everything.

The new 1.0 YAML syntax is really nice as well, as it basically maps to the conventions that Kubernetes has chosen, so if you are familiar with that syntax you should feel at home. You can check out the available plugins here, which cover about 90% of the use cases you would see in the wild.

One downside is that YAML syntax errors are really hard to debug, and there isn’t much in the way of output to figure out where your problems are. The best approach I have found is to run the .drone.yml file through the Drone CLI lint/fmt tool before committing and building.

The Drone CLI tool is really powerful and could probably be its own post. There are some links in the references that show off some of its other features.

References

There are a few cool Drone resources I would definitely recommend checking out if you are interested running Drone in your own environment. The docs reference is a great place to start and is great for finding information about how to configure and tweak various Drone settings.

https://docs.drone.io/reference/

Here is a link to the CLI reference.

https://github.com/drone/awesome-drone

I also definitely recommend checking out the jsonnet extension docs, which can be used to help improve automation workflows. The second link show an good example of how it works and the third link shows some practical applications of using jsonnet to help manage complicated CI pipelines.

https://docs.drone.io/extend/config/jsonnet/
https://github.com/drone/drone/blob/master/.drone.jsonnet
https://medium.com/dazn-tech/simplify-your-ci-pipeline-configuration-with-jsonnet-5a96cd9ccc51

Here is a link for various cool drone stuff, including blog posts and tools.

https://docs.drone.io/cli/

Read More

Multiarch Docker builds using Shippable

Recently I have been experimenting with different ways of building multi architecture Docker images.  As part of this process I wrote about Docker image manifests and the different ways you can package multi architecture builds into a single Docker image.  Packaging the images is only half the problem though.  You basically need to create the different Docker images for the different architectures first, before you are able to package them into manifests.

There are several ways to go about building the Docker images for various architectures.  In the remainder of this post I will be showing how you can build Docker images natively against arm64 only as well as amd64/arm64 simultaneously using some slick features provided by the folks at Shippable.  Having the ability to automate multi architecture builds with CI is really powerful because it avoids having to use other tools or tricks which can complicate the process.

Shippable recently announced integrated support for arm64 builds.  The steps for creating these cross platform builds is fairly straight forward and is documented on their website.  The only downside to using this method is that currently you must explicitly contact Shippable and requests access to use the arm64 pool of nodes for running jobs, but after that multi arch builds should be available.

For reference, here is the full shippable.yml file I used to test out the various types of builds and their options.

Arm64 only builds

After enabling the shippable_shared_aarch64 node pool (from the instruction above) you should have access to arm64 builds, just add the following block to your shippable.yml file.

runtime:
  nodePool: shippable_shared_aarch64

The only other change that needs to be made is to point the shippable.yaml file at the newly added node pool and you should be ready to build on arm64.  You can use the default “managed” build type in Shippable to create builds.

Below I have a very simple example shippable.yml file for building a Dockerfile and pushing its image to my Dockerhub account.  The shippable.yml file for this build lives in the GitHub repo I configured Shippable to track.

language: none

runtime:
  nodePool:
    - shippable_shared_aarch64
    - default_node_pool

build:

  ci:
    - sed -i 's|registry.fedoraproject.org/||' Dockerfile.fedora-28
    - docker build -t local/freeipa-server -f Dockerfile.fedora-28 .
    - tests/run-master-and-replica.sh local/freeipa-server

  post_ci:
    - docker tag local/freeipa-server jmreicha/freeipa-server:test
    - docker push jmreicha/freeipa-server:test

integrations:
  hub:
    - integrationName: dockerhub
      type: dockerRegistryLogin

Once you have a shippable.yml file in a repo that you would like to track and also have things set up on the Shippable side, then every time a commit/merge happens on the master branch (or whatever branch you set up Shippable to track) an arm64 Docker image gets built and pushed to the Dockerhub.

Docs for settings up this CI style job can be found here.  There are many other configuration settings available to tune so I would encourage you to read the docs and also play around with the various options.

Parallel arm64 and amd64 builds

The approach for doing the simultaneous parallel builds is a little bit different and adds a little bit more complexity, but I think is worth it for the ability to automate cross platform builds.  There are a few things to note about the below configuration.  You can use templates in either style job.  Also, notice the use of the shipctl command.  This tool basically allows you to mimic some of the other functionality that exists in the default runCI jobs, including the ability to login to Docker registries via shell commands and manage other tricky parts of the build pipeline, like moving into the correct directory to build from.

Most of the rest of the config is pretty straight forward.  The top level jobs directive lets you create multiple different jobs, which in turn allows you to set the runtime to use different node pools, which is how we build against amd64 and arm64.  Jobs also allow for setting different environment variables among other things.  The full docs for jobs shows all of the various capabilities of these jobs.

templates: &build-test-push
  - export HUB_USERNAME=$(shipctl get_integration_field "dockerhub" "username")
  - export HUB_PASSWORD=$(shipctl get_integration_field "dockerhub" "password")
  - docker login --username $HUB_USERNAME --password $HUB_PASSWORD
  - cd $(shipctl get_resource_state "freeipa-container-gitRepo")
  - sed -i 's|registry.fedoraproject.org/||' Dockerfile.fedora-27
  - sed -i 's/^# debug:\s*//' Dockerfile.fedora-27
  - docker build -t local/freeipa-server -f Dockerfile.fedora-27 .
  - tests/run-master-and-replica.sh local/freeipa-server
  - docker tag local/freeipa-server jmreicha/freeipa-server:$arch
  - docker push jmreicha/freeipa-server:$arch

resources:
    - name: freeipa-container-gitRepo
      type: gitRepo
      integration: freeipa-container-gitRepo
      versionTemplate:
          sourceName: jmreicha/freeipa-container
          branch: master

jobs:
  - name: build_amd64
    type: runSh
    runtime:
      nodePool: default_node_pool
      container: true
    integrations:
      - dockerhub
    steps:
      - IN: freeipa-container-gitRepo
      - TASK:
          runtime:
            options:
              env:
                - privileged: --privileged
                # Also look at using SHIPPABLE_NODE_ARCHITECTURE env var
                - arch: amd64
          script:
            - *build-test-push

  - name: build_arm64
    type: runSh
    runtime:
      nodePool: shippable_shared_aarch64
      container: true
    integrations:
      - dockerhub
    steps:
      - IN: freeipa-container-gitRepo
      - TASK:
          runtime:
            options:
              env:
                - privileged: --privileged
                - arch: arm64
          script:
            - *build-test-push

As you can see, there is a lot more manual configuration going on here than the first job.

I decided to use the top level templates directive to basically DRY the configuration so that it can be reused.  I am also setting environment variables per job to ensure the correct architecture gets built and pushed for the various platforms.  Otherwise the configuration is mostly straight forward.  The confusion with these types of jobs if you haven’t set them up before mostly comes from figuring out where things get configured in the Shippable UI.

Conclusion

I must admit, Shippable is really easy to get started with, has good support and has good documentation.  I am definitely a fan and will recommend and use their products whenever I get a chance.  If you are familiar with Travis then using Shippable is easy.  Shippable even supports the use of Travis compatible environment variables, which makes porting over Travis configs really easy.  I hope to see more platforms and architectures supported in the future but for now arm64 is a great start.

There are some downside to using the parallel builds for multi architecture builds.  Namely there is more overhead in setting up the job initially.  With the runSh (and other unmanaged jobs) you don’t really have access to some of the top level yml declarations that come with managed jobs, so you will need to spend more time figuring out how to wire up the logic manually using shell commands and the shipctl tool as depicted in my above example.  This ends up being more flexible in the long run but also harder to understand and get working to begin with.

Another downside of the assembly line style jobs like runSh is that they currently can’t leverage all the features that the runCI job can, including the matrix generation (though there is a feature request to add it in the future) and report parsing.

The last downside when setting up unmanaged jobs is trying to figure out how to wire up the different components on the Shippable side of things.  For example you don’t just create a runCI job like the first example.  You have to first create an integration with the repo that you are configuring so that shippable can make an rSync and serveral runSh jobs to connect with the repo and be able to work correctly.

Overall though, I love both of the runSh and runCI jobs.  Both types of jobs lend themselves to being flexible and composable and are very easy to work with.  I’d also like to mention that the support has been excellent, which is a big deal to me.  The support team was super responsive and helpful trying to sort out my issues.  They even opened some PRs on my test repo to fix some issues.  And as far as I know, there are no other CI systems currently offering native arm64 builds which I believe will become more important as the arm architecture continues to gain momentum.

Read More

Limit Jenkins Multibranch Pipeline Builds

Update: Thanks to Torben Kerr for adding instructions for setting up these properties at the multibranch build level rather then for each Jenkinsfile.  You can check out his “fix” here.

As the Jenkins pipeline functionality continues to rapidly evolve – the project documentation (or lack thereof), has been a consistent pain point as a user. Invariably, the documentation is either out of date or completely missing.  I expect the docs to improve as the project matures, but for now, the cake is a lie.  I ran into this roadblock recently, looking for a way to limit the number of concurrent builds that happen in Jenkins, using the pipeline.  In all of my anguish, I hope this post will help others in avoiding the tediousness of finding the seemingly simple functionality of limiting concurrent builds, as well as give some insight into strategies for figuring out how to find undocumented features in Jenkins.

While this feature is fairly obvious for old-style Jenkins jobs, a simple check box in the job configuration – finding the same functionality for pipelines is seemingly non existent.  Through extensive Googling and Stack Overflowing, I discovered this feature was recently added to the Multibranch plugin.  Specifically, I found an issue in the (awful) issue tracker used by Jenkins, which in turn led me to uncover some code in a semi recent PR that basically allows concurrency to be turned on or off.  Of course when I tried to use the code from the PR it didn’t work right away.  So I had to go deeper.

Eventually, I  stumbled across a SO post that discusses how to use the properties functionality of pipelines.  Equipped with this new piece of information, I finally had enough substance to start playing around with the code.  To make the creation of pipelines easier, Jenkins also recently added a snippet generator, which allows users to build out sample snippets quickly.

To use the snippet generator, either drill into an existing pipeline style job using a similar URL as below:

https://jenkins.example.com/job/<jobname>/pipeline-syntax/

Or create a new job, and click on the “Pipeline Syntax” link after it has been created to test out different snippets.

pipeline syntax

Inside the snippet generator there are a number of “steps” to choose from.  From the information I had already gathered, I just selected the properties step to create the basic skeleton of what I wanted and was able to use the disableConcurrentBuilds() function I found earlier. Below is a snippet of what the code in your Jenkinsfile might actually look like:

node {
 // This oneliner is what limits concurrent builds
 properties([disableConcurrentBuilds()])

 // Do stuff
 ...
}

Yep.  That’s it.  Just make sure to put the properties() function at the beginning of the node block, otherwise concurrency won’t be adjusted right away and could lead to problems.  Another thing to note; the step to disable concurrency could just as easily be moved into workflow libraries and applied at the global level and applied at the beginning of all jobs if you wanted to limit concurrency for all pipeline builds, since the code is just Groovy.  Finally, the code will disable concurrent builds on a per branch basis.  Essentially, if you push many different branches it will still build all of them, it will just limit each branch to one build at a time and will queue up jobs for any commits that get pushed after the initial job has been created.  I know that is a mouthful.  Let me know in the comments if this explanation needs any clarification.

While I love open source software, sometimes project’s move so fast that certain areas of it get neglected.  I am thankful for things like Github, because I was able use it to piece together all the other information I found to come up with a solution.  But, I would argue having good documentation not only saves folks like me the time and energy of the crazy searches, it also makes it much easier for potentially new users to look at, and understand what is going on.  I will be 100% honest and say that Jenkins pipelines are not for the faint of heart, and I’m sure there are many others who will agree with this sentiment.  I know it is easier said than done, but anything right now would be an improvement in my opinion.

Read More