CI/CD tools landscape

For any software development team, it is critical to deliver value as quickly, safely and reliably as possible. Speed of delivery has been shown to correlate directly with organisational performance (see, e.g., the State of DevOps report). The delivery process therefore influences company valuation and is critical for scaling the engineering effort while upholding the desired quality of the product. How to achieve this is one of the cornerstones of DevOps and SRE. To get an idea of how modern software delivery works in a successful company, see how the delivery pipeline works at AWS.
This blog post is by no means a replacement for specialised deep-dive or best-practice literature, e.g. Continuous Delivery or Continuous Integration, but rather an evaluation of the current tooling landscape, which can help in achieving project goals. Needless to say, a tool alone won’t make the magic happen without a correct delivery pipeline design, but in this post we will focus solely on the tooling.

Firstly, let’s clarify the essential terms (for reference, see a good article from Atlassian):

Continuous integration (CI) is the practice of automating the integration of code changes from multiple contributors into a single software project.

Continuous Delivery (CD) is the ability to get changes of all types, including new features, configuration changes, bug fixes and experiments, into production, or into the hands of users, safely and quickly in a sustainable way.

Continuous deployment (CD) is a strategy for software releases wherein any code commit that passes the automated testing phase is automatically released into the production environment. It is paramount to software delivery processes. 

The delivery process is critical in any software company. From my perspective and experience, the current state is far from being “solved”, and the number of tools appearing every year confirms that, as does the amount of money spent by VCs. The majority of tools are imperative, while the next big trend seems to be “declarative” CI/CD tooling. I am curious about what the future will bring.

A wide variety of tools is available (the list is by no means exhaustive):

Our selection and evaluation criteria are based on our current and future needs:

  • cost-effective (auto-scaling workers, etc.)
  • cost of maintenance
  • speed of development/ability to contribute
  • manual approval stage
  • ability to pass certification (audit-ability, permission and roles, etc.)
  • multi-cloud support
  • support for VMs + Kubernetes deployments + potentially serverless
  • ability to integrate Infrastructure as Code into the delivery pipeline
  • does not require scrapping all our development infra (keep the cost/benefit ratio in mind)
  • majority of our workloads are running in GCP
  • copes with a mono-repo
  • support for long term support (LTS) branches
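
One way to compare tools against a criteria list like the one above is a simple weighted score. The sketch below is only an illustration: the weights, the ratings and the tool names (tool_a, tool_b) are made-up placeholders, not results of this evaluation.

```python
from __future__ import annotations

# Hypothetical weights (3 = must have, 1 = nice to have); not taken from the evaluation.
WEIGHTS = {
    "cost_effective": 3,
    "manual_approval": 3,
    "auditability": 2,
    "monorepo": 2,
}


def weighted_score(ratings: dict[str, int], weights: dict[str, int] = WEIGHTS) -> float:
    """Return the weighted average of 0-5 ratings; criteria without a rating count as 0."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings.get(c, 0) for c in weights) / total_weight


def rank(ratings_by_tool: dict[str, dict[str, int]]) -> list[str]:
    """Return tool names sorted from the highest to the lowest weighted score."""
    return sorted(
        ratings_by_tool,
        key=lambda tool: weighted_score(ratings_by_tool[tool]),
        reverse=True,
    )


# Placeholder ratings for two made-up tools.
RATINGS = {
    "tool_a": {"cost_effective": 4, "manual_approval": 5, "auditability": 3, "monorepo": 2},
    "tool_b": {"cost_effective": 2, "manual_approval": 0, "auditability": 4, "monorepo": 5},
}

if __name__ == "__main__":
    for tool in rank(RATINGS):
        print(tool, round(weighted_score(RATINGS[tool]), 2))
```

A hard requirement such as the manual approval stage is probably better handled as a filter before scoring than as a heavily weighted criterion.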

The following tools made it onto the shortlist for evaluation and a deep dive. See the dedicated post for each of them:

Summary:
Our ideal solution would be tooling provided by our primary cloud provider that meets our current and near-future needs and is fully managed. We partially matched that with a combination of Cloud Build and Spinnaker for GCP, based on the tutorial provided by GCP.
Generally, my impression from the study and evaluation of the tools listed is that those claiming “full CI/CD” support are great at neither CI nor CD and lie somewhere in the middle. They provide a platform and let you code the rest. Another pain point is tackling the monorepo and providing a means to work with it efficiently. The platforms seem somewhat pricey, and the amount of infra work still needed to get all the necessary features is not low enough to justify the price. I am curious about what Harness will provide in this space.
I am not promoting the combination we ended up with, but moving away from Concourse CI was a clear win. There, the missing resource management for stages was a total killer, and the insufficient authorisation and role management together with the absence of manual steps were clear signals not to continue that journey. For a fresh new project, GitLab would be a no-brainer to start with. It provides everything needed for development, but when the project grows significantly it can become pricey, and even GitLab itself motivates you to move partially to your own infrastructure. Needless to say, that setup requires some amount of work, especially proxying and creating network waypoints.
If you have experience with the tools evaluated or disagree with any of the points, please use the comment section to share your view, and don’t forget to like and follow me on Twitter!


CD with Spinnaker – evaluation

Spinnaker is one of the popular continuous delivery platforms, originally developed at Netflix. I am evaluating version 1.23.5. Spinnaker is a multi-cloud continuous delivery platform supporting VM- and Kubernetes-based deployments (serverless is under development). It is an extensible platform, and an HA setup is possible. This post is part of a bigger series with a unified structure.

Overview:
Spinnaker Architecture
Spinnaker basic concepts (Spinnaker started with VM deployments; Kubernetes concepts are mapped onto them in the provider)
Pipeline stages
– Support for a Manual Judgment stage, though with no detailed permission model for actions (non-OSS plugins exist, e.g. from Armory)
– Nested pipelines are supported (either fire and forget or wait for completion)
Custom stages development (REST call, Kubernetes job or Jenkins job, …)
– Development of new stage

Authentication & Authorisation (Spinnaker security concepts):
Spinnaker Authentication
Spinnaker Authorisation with Role Based Access
– Spinnaker can be accessed through GCP Identity-Aware Proxy (or an equivalent service on other cloud providers)
– Authentication via the G Suite identity provider or GitHub teams. Other options exist as well, see the overview here.
– Authorisation with Google Groups (only a flat structure is supported, role = name of the group), GitHub teams, raw mapping or others
Pipelines are versioned automatically
Pipeline triggers
– Concept of providers, which integrate pipelines with target platforms or cloud providers, e.g. Kubernetes provider v2
– Support for complex deployment strategies
– Management CLIs – Halyard (Spinnaker configuration) and spin (pipeline management)
– Deployment to Kubernetes in the form of native manifests; Helm packages are transformed into native manifests in the Helm Bake stage (using native Helm support for templating)
– Terraform stage as a custom stage, e.g. the OSS implementation
– Wide variety of notification options
– Monitoring support via Prometheus
Backup configuration to storage
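
To make the stage concepts above more concrete, here is a rough sketch of how a bake, manual judgment and deploy pipeline could be assembled as JSON. The field names (refId, requisiteStageRefIds, templateRenderer and so on) are written from memory of Spinnaker's pipeline JSON and should be checked against a pipeline exported from your own installation; the application and account names are hypothetical, and the artifact wiring is left out, so this is not deployable as-is.

```python
from __future__ import annotations

import json


def build_pipeline(application: str, name: str, instructions: str) -> dict:
    """Assemble a three-stage pipeline: bake Helm chart -> manual judgment -> deploy."""
    stages = [
        {
            "refId": "1",
            "requisiteStageRefIds": [],
            "type": "bakeManifest",
            "name": "Bake (Helm)",
            "templateRenderer": "HELM3",  # assumption: renderer name as recalled
        },
        {
            "refId": "2",
            "requisiteStageRefIds": ["1"],  # runs only after the bake stage
            "type": "manualJudgment",
            "name": "Approve release",
            "instructions": instructions,
            "failPipeline": True,  # rejecting the judgment stops the pipeline
        },
        {
            "refId": "3",
            "requisiteStageRefIds": ["2"],  # runs only after approval
            "type": "deployManifest",
            "name": "Deploy",
            "cloudProvider": "kubernetes",
            "account": "my-k8s-account",  # hypothetical account name
        },
    ]
    return {"application": application, "name": name, "stages": stages}


def dangling_refs(pipeline: dict) -> list[str]:
    """Return stage references that point at a refId not present in the pipeline."""
    known = {stage["refId"] for stage in pipeline["stages"]}
    return [
        ref
        for stage in pipeline["stages"]
        for ref in stage["requisiteStageRefIds"]
        if ref not in known
    ]


if __name__ == "__main__":
    pipeline = build_pipeline("demo-app", "release", "Check the staging dashboard first.")
    print(json.dumps(pipeline, indent=2))
```

The resulting JSON file could then be stored in Git and saved with the spin CLI.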

Pricing:
– Spinnaker itself is free; you pay only for the resources consumed by the deployment
– Requires VMs and Redis or Cloud SQL (PostgreSQL)
– Load balancer
Spinnaker for GCP is an option if you are running on GCP; you pay only for the resources needed.

Resources:
https://spinnaker.io/
https://www.slideshare.net/Pivotal/modern-devops-with-spinnaker-olga-kundzich
https://spinnaker.io/concepts/ebook/

Summary:
A tool focused on CD, with manual approval stages and a security model which makes it SOC 2 compliant. Good auditability is in place (it is possible to integrate with the GCP audit log). For scripted stages and the manual approval stage it is only possible to specify a group, and this is done at the application/pipeline level. The tool eliminates Helm from the Kubernetes cluster, as it works with native Kubernetes manifests. It promotes immutable infrastructure, as those artefacts are stored for possible rollbacks. Authentication and authorisation seem complex but flexible enough to integrate with a wide variety of systems. There is a pretty active user group offering help. Pricing is based on the resources used.

CI/CD with GitLab – evaluation

GitLab is one of the popular DevOps platforms out there at the moment. I am evaluating GitLab version 13.7-pre release features. This post is part of a bigger series with a unified structure. The evaluation was done in the context of our existing infrastructure: GitHub + Prometheus + Grafana.

High level overview: 

Authentication/Authorisation:

CI/CD capabilities:

Pricing:

  • Plans include a quota of CI minutes; extra minutes can be bought ($10 per 1,000 minutes)
  • Storage is paid for separately: $60 per 10 GB (see details)
  • Based on my understanding, we need at least Premium at $19/user/month.
  • GitLab pricing
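
To get a feeling for how these numbers add up, here is a back-of-the-envelope estimate. The prices are the ones listed above; the included-minutes quota, the team size and the usage figures are hypothetical, and the sketch assumes that extra minutes and storage are sold only in whole packs.

```python
import math

SEAT_PRICE_USD = 19.0          # Premium, per user per month (from the list above)
MINUTE_PACK_PRICE_USD = 10.0   # one pack of extra CI minutes (from the list above)
MINUTE_PACK_SIZE = 1000        # minutes per pack
STORAGE_PACK_PRICE_USD = 60.0  # one 10 GB pack; the billing period is not stated above
STORAGE_PACK_GB = 10


def monthly_ci_cost(users: int, minutes_used: int, included_minutes: int) -> float:
    """Seats plus whole packs of extra CI minutes for one month."""
    extra_minutes = max(0, minutes_used - included_minutes)
    packs = math.ceil(extra_minutes / MINUTE_PACK_SIZE)
    return users * SEAT_PRICE_USD + packs * MINUTE_PACK_PRICE_USD


def storage_cost(extra_storage_gb: float) -> float:
    """Cost of the whole 10 GB packs needed to cover the extra storage."""
    return math.ceil(extra_storage_gb / STORAGE_PACK_GB) * STORAGE_PACK_PRICE_USD


if __name__ == "__main__":
    # Hypothetical team: 20 users, 60,000 build minutes a month, 10,000 included.
    print(monthly_ci_cost(users=20, minutes_used=60_000, included_minutes=10_000))
    print(storage_cost(extra_storage_gb=25))
```

With these made-up inputs the build minutes cost more than the seats, which is where moving to your own registered runners starts to look attractive.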

I haven’t studied the GitLab offering in great depth, but for building a new project I would consider starting with it, as it provides complete SDLC support (compared to Spinnaker, it is CI + CD). It acts as SDLC management on top of the cloud provider, offering an easy way to comply with the majority of measures required for certification, e.g. SOC 2, but those are Gold plan features ($99/user/month). This might be pricey, but if you also use its ticket management, documentation (instead of, e.g., Jira), roadmap tooling, release-notes management and Terraform stage, it seems like a no-brainer!

I see the following challenges:

  • Ordering of deployments when pipelines run in parallel
  • Shared runners are small machines; stepping up to registered (self-hosted) runners adds admin and infra work
  • The security model is similar to Spinnaker’s; in addition, it doesn’t allow custom groups, but I guess that you can create custom apps (users)
  • Pricing seems scary: in the end, runners will probably run on your own infra and be registered to the platform; OTOH, if you manage to stay on shared runners, you need to buy a lot of build minutes
  • Storage cost seems high
  • The Docker registry has a 30-day expiry (it can probably be extended) => you will end up uploading to your own GCR

I haven’t studied the deployment capabilities in depth:

  • Integration with Helm – probably rendering via helm template and then deploying
  • Support for deployment strategies – requires appropriate Kubernetes object manifests, as everywhere
  • Registered Kubernetes clusters seem to have an agent running in them
  • Has more or less all the concepts from Spinnaker
  • Has initial support for Terraform, in alpha

Potential pain points:

  • Having the whole pipeline in Git (including deployment strategy configurations and approvals) might pose challenges when there is no pure trunk-based development: it requires backporting and makes oversight harder.

GitLab is built on top of plenty of open-source projects, so I can imagine that the integration work between your infrastructure and GitLab might be extensive.

The only reasonable scenario is that you fully migrate to GitLab and reduce extra tooling like Asana, GitHub, Confluence, …; for new projects it might be a no-brainer. That migration can be pretty heavy, but in return you might get some compliance checks in a single workspace.

Resources:

CI with GCP Cloud Build – evaluation

Cloud Build is one of the services available on Google Cloud Platform. The evaluation happened in January 2021, and I believe the service is still improving. This post is part of a bigger series with a unified structure.

Overview:

  • Even though Cloud Build labels itself as a CI/CD tool, it lacks CD features (e.g. deployment strategies, manual approval stages, etc.) – nothing prevents you from developing those yourself
  • Runs in GCP, with some support for local execution as well
  • Builds are defined by wiring Docker containers together. They execute on a single VM, which you can scale up to high-CPU machines with up to 32 CPUs.
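
As an illustration of the "wiring containers together" model, the sketch below generates a small build config as JSON (Cloud Build accepts JSON as well as YAML). The image name and the `make test` command are hypothetical; the N1_HIGHCPU_32 machine type is, as far as I know, the 32-CPU option referred to above, but check the current documentation for the available values.

```python
import json


def build_config(image: str, machine_type: str = "N1_HIGHCPU_32") -> dict:
    """Return a two-step build: build a Docker image, then run tests inside it."""
    return {
        "steps": [
            # Each step is a container; "name" is the image that runs the step.
            {"name": "gcr.io/cloud-builders/docker", "args": ["build", "-t", image, "."]},
            # Hypothetical test step: runs inside the image built by the previous step.
            {"name": image, "entrypoint": "sh", "args": ["-c", "make test"]},
        ],
        # Images listed here are pushed to the registry when the build succeeds.
        "images": [image],
        "options": {"machineType": machine_type},
    }


if __name__ == "__main__":
    # $PROJECT_ID and $SHORT_SHA are substitutions filled in by Cloud Build.
    config = build_config("gcr.io/$PROJECT_ID/demo-app:$SHORT_SHA")
    print(json.dumps(config, indent=2))
```

All steps share the same VM and workspace directory, which is why the second step can use the image produced by the first one.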

Continuous Integration features:

Pricing:

Summary:

A pure CI system with the capability to build (~ Cloud Build). There are no time-based triggers, only event-based (commit, tag, …) or manual ones. A time-based trigger could probably be emulated by a Cloud Function that fires the trigger. It has the ability to run locally, which is nice. It scales up to 32-CPU machines. Pricing is based on build time (wall-clock time). It doesn’t offer approval stages; the security model is based on IAM, and it seems that you cannot grant permissions on a particular configuration/build. It doesn’t have the concept of a pipeline, but rather a set of task steps (stages). The definition lives in Git, so LTS branches should be buildable. To have a full end-to-end deployment, you need a CD system; this system manages just the “build artefact”.
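
The time-based trigger emulation mentioned above could look roughly like this: a small function, invoked on a schedule, that calls the Cloud Build REST endpoint for running an existing trigger. The sketch only builds the HTTP request; how the function is scheduled and how it obtains an access token are left out, and the project, trigger and branch values are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://cloudbuild.googleapis.com/v1"


def build_run_trigger_request(
    project_id: str, trigger_id: str, branch: str, access_token: str
) -> urllib.request.Request:
    """Build the POST request that asks Cloud Build to run an existing trigger."""
    url = f"{API_ROOT}/projects/{project_id}/triggers/{trigger_id}:run"
    # The request body is a RepoSource; here we only select the branch to build.
    body = json.dumps({"branchName": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical values; a real function would fetch a token for its service account.
    request = build_run_trigger_request("my-project", "nightly-build", "main", "TOKEN")
    print(request.get_method(), request.full_url)
    # Sending it would be: urllib.request.urlopen(request)
```

The service account running the function needs permission to run builds in the project, which ties back to the IAM-based security model described above.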

CI/CD with Jenkins – evaluation

The Jenkins evaluation happened in January 2021, and I believe that Jenkins is still improving. This post is part of a bigger series with a unified structure.

Overview:
Pipeline definition lives completely in Git together with the code ~> Jenkinsfile
– Jenkinsfile is written in a Groovy DSL
You can chain the pipelines
– Single pipeline triggered on various branches ~> Multi-branch pipelines (tutorial)
Parallel pipeline stages
– Access to build meta-data (e.g.  build-number, commit hash, …)
Jenkins Configuration as Code plugin
– Managing secrets via secrets plugin
Audit trail plugin
Try notifier
– Better UI with Blue Ocean
– Tooling – Jenkins Job Builder (Job builder tutorial)
Pull-request Jenkins pipeline
– Deployment topology – master and slave/agent nodes
Jenkins Helm deployment – seems to have autoscaling agents, based on the Configuration as Code plugin
– Manual approvals – possible via the input option, but it does not seem straightforward
Jenkins on Google Kubernetes Engine

Security model:
– By default there are no roles; everyone has a single view -> handled via plugins
GitHub OAuth and here
Role-based authorisation plugin (strategy plugin – role) – probably doesn’t work together with GitHub OAuth, but can work with matrix-based access

Resources:
Jenkins for beginners

Summary:
Jenkins is one of the most popular open-source CI/CD systems. It needs to be self-hosted, but the Kubernetes plugin seems to have agent autoscaling capabilities, which should make it cost-effective. It seems that the whole Jenkins configuration can be bootstrapped from code.

The security model has various options, and I am not sure how they all fit together, e.g. GitHub OAuth + roles and security, but there are multiple ways, e.g. a control matrix.

Jenkins has the concepts of pipelines and jobs. Pipelines are the next generation and live completely in the code base ~> LTS branches should be OK. It seems to have some basic manual approval stages; the question is how that goes together with auth. It has the concept of multi-branch jobs/pipelines = a single definition for a whole bunch of branches, where the definition is taken dynamically from the source.

CD capabilities are somewhat simplistic: there are no advanced release strategies, rollbacks, monitoring, etc. Those would probably need to be scripted.