Who monitors Prometheus?

Of course, I am not talking about Greek mythology but about the popular monitoring tool Prometheus, inspired by Google's internal monitoring tool called Borgmon.

As some teams invest significant effort in making their solution transparent, easy to troubleshoot and stable by instrumenting their product and technology and wiring it into their monitoring system, only one question remains: who monitors the monitoring? If the monitoring is down, you won't get any alarm about a malfunction of the product and you are back where you started. So you add monitoring of the monitoring system, and the tough game of availability begins. If both components have 95% availability, you effectively achieve only 90% (0.95 × 0.95 ≈ 0.90). I believe you get the rules of the game we play.

How is monitoring of monitoring set up in practice? Well, the implementation differs, but the core idea is the same and works well even in the army: it is called a dead man's switch. If we are supposed to receive a signal that triggers an alarm at an unknown moment, we need to guarantee that the signal can trigger the alarm at any time. Reversing the logic gives us that guarantee: we emit a signal constantly, and the alarm is triggered when the signal stops arriving. So simple! This principle (a heartbeat) is used in multiple places, e.g. clustering, and the army uses it as well.


Some monitoring tools have this capability built in, but Prometheus doesn't. So how do we achieve it, in order to sleep well knowing that the watcher is watching? We need to set up a rule in Prometheus that is constantly firing. It can look like this:

    - name: monitoring-dead-man
      rules:
      - alert: "Monitoring_dead_man"
        expr: vector(1)
        labels:
          service: deadman
        annotations:
          summary: "Monitoring dead man switch should always fire alert"
          description: "Monitoring dead man switch for probing alert path"

Now we need to create the heartbeat. The rule on its own would fire once, be propagated to Prometheus Alertmanager (the component responsible for managing alerts), and that would be it. We need our check-ins at a regular interval. The interval is dictated by the availability you want to achieve, as it all adds to the reaction time you need. You can achieve this behaviour with a special route for the alert in Alertmanager:

      - receiver: 'DEAD-MAN-SNITCH'
        match:
          service: deadman
        repeat_interval: 5m

Now we need the reversed trigger logic. In our particular case we use Dead Man's Snitch, which is great for monitoring batch jobs, e.g. a data import into your database. It works in such a way that if you do not check in within the specified interval, it triggers, with the lead time given by that interval. You can specify the rules for when to trigger, but those are details of the service you use. All you need to add to the Alertmanager configuration is a receiver definition that checks in with the particular snitch, as follows:

       - name: 'DEAD-MAN-SNITCH'
          webhook_configs:
            -  url: 'https://nosnch.in/your_snitch_id'
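For orientation, this is how the route and the receiver might sit together in a minimal alertmanager.yml; the 'default' receiver is just a placeholder for your regular notification channel:

    route:
      receiver: 'default'
      routes:
        - receiver: 'DEAD-MAN-SNITCH'
          match:
            service: deadman
          repeat_interval: 5m

    receivers:
      - name: 'default'
        # your regular notification channel, e.g. PagerDuty or email
      - name: 'DEAD-MAN-SNITCH'
        webhook_configs:
          - url: 'https://nosnch.in/your_snitch_id'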

The last thing you need to do is integrate the trigger with the system you use for on-call rota management, e.g. PagerDuty. For this example, that means the integration between Dead Man's Snitch and PagerDuty.
As I am not an infra or DevOps guy by heart, it took me some time to figure it out and connect the bits. I hope you found it useful. If you have other experiences or a different way of achieving this behaviour, let me know in the comment section below.


Kubernetes Helm features I would wish to know from day one

Kubernetes Helm is a package manager for Kubernetes deployments. It is one of the possible tools for deployment management on the Kubernetes platform. You can imagine it as RPM in the Linux world, with package management on top of it like the apt-get utility.

Helm release management, the ability to install or roll back to a previous revision, is one of the strongest selling points of Helm, and together with strong community support it makes Helm an exciting option. Especially the number of ready-made packages is amazing and makes it extremely easy to bootstrap a tech stack on a Kubernetes cluster. This article is not meant to be a comparison of Kubernetes tools, though; it describes the experience I've made while working with Helm and the limitations I ran into, which for someone else might be quite acceptable, but knowing how to work within them can be an additional benefit.
Helm is written in Go and uses Go templates, which brings some limitations to the tool. Helm works on the level of string literals, and you need to take care of quotation, indentation etc. to form a valid Kubernetes deployment manifest. This is a deliberate design decision by the creators of Helm, and it is good to be aware of it. Secondly, Helm merges two responsibilities: rendering a template and providing a Kubernetes manifest. Though you can nest templates or create a template hierarchy, the rendered result must be a Kubernetes manifest, which somewhat limits the possible use cases for the tool. Having the ability to render any template would extend its capabilities, as quite often the Kubernetes deployment descriptors contain some configuration where you would appreciate type validation or the possibility to render it separately. At the time of writing this article, Helm version 2.11 didn't allow that. To achieve it, you can combine Helm with other tools like Jsonnet and tie them together.
While combining different tools, I found the following advanced Helm features quite useful; they greatly simplified and gave some structure to the resulting Helm templates.

Template nesting (named templates)
Sub-templates can be structured in helper files starting with an underscore, e.g. `_configuration.tpl`, as files starting with an underscore don't need to contain a valid Kubernetes manifest.

{{- define "template.to.include" -}}
The content of the template
{{- end -}}

To use the sub-template, you can use

{{ include "template.to.include" .  -}}

Where “.” passes the current context. When there are problems with the context (e.g. inside `range` or `with`, where “.” is rebound), the trick with `$`, which always refers to the root context, solves the issue.
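A minimal sketch of that trick (the value names are made up): inside a range block, “.” is rebound to the current list item, while `$` still points to the root context:

{{- range .Values.components }}
- name: {{ .name }}
  image: {{ $.Values.global.registry }}/{{ .name }}
{{- end }}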

To include a file, all you need to do is specify its path:

{{ $.Files.Get "application.conf" -}}

Embedding a file as configuration

{{ (.Files.Glob "application.conf").AsConfig  }}

It generates the key-value pairs automatically, which is great when declaring a ConfigMap.
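For illustration, a ConfigMap template using this pattern could look as follows (the file and resource names are assumptions):

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-config
data:
{{ (.Files.Glob "application.conf").AsConfig | indent 2 }}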

To provide a reasonable error message and make some config values mandatory, use the required function

{{ required "Error message if value not specified" .Values.component.port }}

Accessing a key from the values file which contains a dot, e.g. application.conf

{{- index .Values.configuration "application.conf" }}
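This typically pairs with a values file where the dotted key has to be quoted, for example (hypothetical content):

configuration:
  "application.conf": |
    db.host = localhost
    db.port = 5432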

IF statements combining multiple values

{{- if or (.Values.boolean1) (.Values.boolean2) (.Values.boolean3) }}

Wrapping each value reference in parentheses solves the issue.

I hope you found these tips useful, and if you have any suggestions, let me know in the comment section below.

Software Deployment – Java applications as an RPM Linux package

Java application archives such as jar, war and ear files are the elementary distribution blocks in the Java world. In the beginning, managing all of these libraries and components was a bit cumbersome and error prone, as project dependencies depend on other libraries and all those transitive dependencies create the so-called dependency hell. To ease developers of this burden, Apache Maven (and Maven-like tools) was developed. Every artefact has so-called coordinates which uniquely identify it, and all dependencies are driven by those coordinates in a recursive fashion.

Maven eases the management at the stage of artefact development, but doesn't help that much when we want to deploy the application component. Often that's not such a big deal if your runtime environment is a clustered J2EE application server, e.g. a WebLogic cluster. You hand the ear or war over to your ops team and they deploy it to all nodes of the cluster at once via the cluster management console. They need to maintain an archive of deployed components in case of a rollback, etc. This is the simplest case (an isolated component; it doesn't solve dependencies such as libraries provided by the cluster), where management is relatively clean but relies a lot on the process. When we consider a different runtime environment, such as running the application as a standalone Java process (as opposed to a J2EE cluster), things get a bit more complicated even in the simplest case. Your Java application is typically distributed as a jar file, and you need to distribute it to every single Linux server where an instance of this process is running. Apart from that, a standard jar file doesn't contain its dependencies. One possible solution would be to create a shaded (fat) jar file which has all dependencies embedded. I suppose that you have a repository where all builds are archived. Does it make sense to store those big archives where the major part is 3rd-party libraries? This is probably not the right way to go.

Another aspect of the roll-out process is the ability to automate it. In the case of J2EE clusters like WebLogic, there is often a scripting tool provided (WLST, the WebLogic Scripting Tool). The land of the pure jar is again a lot worse. You can take some advantage of Maven, but that doesn't solve all the problems. The majority of production environments in the Java world run on the Linux operating system, so why not take advantage of the standard Linux distribution management tools like yum, apt etc. for distributing RPM Linux packages? This system provides atomicity, dependency management between RPM packages, an easy way to roll back (it keeps track of versions), it minimises the number of manual steps so the potential for human error is reduced, and it involves native auditing. It is pretty easy to get information about the installation history.
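A few hedged examples of the native tooling behind those claims (a yum-based distribution is assumed; the transaction id is a placeholder):

yum history list                   # audit trail of install/upgrade transactions
yum history undo <transaction-id>  # roll back a specific transaction
rpm -qa --last | head              # the most recently installed packages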

To pack your Java jar application, you need a tool called rpmbuild, which creates a Linux package from a SPEC file. A SPEC file is something like a pom in the Maven world, plus it contains instructions on how to install, uninstall etc. The packages containing the required and some handy tools are rpmdevtools and rpmlint. On a Linux OS it is simple to install them; on Windows you need Cygwin installed with the same tool set. In order to build your rpm workspace, run the following command. It is highly recommended not to run it under the root account unless there is a special need for it.

rpmdev-setuptree

This command creates the rpmbuild folder, the place where all Linux RPM packaging will happen. It contains the sub-folders BUILD, RPMS, SOURCES, SPECS and SRPMS. For us the important ones are RPMS, which will contain the final Linux rpm, and SPECS, where we need to put our SPEC file describing the installation and content of our application.
The SPEC file is the core of Linux RPM packaging. It contains all the information about the version, dependencies, installation, un-installation, upgrade etc. We can create a skeleton SPEC file by running the following command:

rpmdev-newspec

The majority of directives in this file are clear from their names, e.g. Name, Version, Summary, BuildArch etc. BuildRoot requires special attention. It is a sort of proxy which mimics the root of the system under construction, e.g. if I want to install my [application_name] (replace this placeholder with the actual name) to the /usr/local/[application_name] location, I have to create this structure under BuildRoot during the installation. Then there are important sections which correspond to the various phases of installation: %prep, %build and %install, the last of which is the most important for us, as we do not build from sources but just pack an already built jar file into the Linux rpm package. Another very important section of this file is %files, which lists all files that will be in the final Linux rpm package and hence installed on the target machine. Apart from that, there can be additional hooks into the installation and un-installation process, such as %post, %preun, %postun etc., which allow you to customise the process as you need. A sample SPEC file follows:

%define _tmppath /home/virtual/rpmbuild/tmp
Name: [application_name]
Version: 1.0.2
Release: 1%{?dist}
Summary: Processor component which feeds data into DB
Group: Applications/System
License: GPL
URL: https://jaksky.wordpress.com/
BuildRoot: %{_topdir}/%{name}-%{version}-%{release}-root
BuildArch: noarch
Requires: jdk >= 7
%description
Component which process incoming messages and store them to DB.

%prep

%build

%install
rm -rf $RPM_BUILD_ROOT
mkdir -p $RPM_BUILD_ROOT/usr/local
cp -r %{_tmppath}/[application_name] $RPM_BUILD_ROOT/usr/local
mkdir -p $RPM_BUILD_ROOT/usr/local/[application_name]/logs
mkdir -p $RPM_BUILD_ROOT/etc/init.d
cp -r %{_tmppath}/[application_name]/bin/[application_name] $RPM_BUILD_ROOT/etc/init.d
mkdir -p $RPM_BUILD_ROOT/var/run/[application_name]

%files
%defattr(644,[application_name],[application_name])
%dir %attr(755,[application_name],[application_name]) /usr/local/[application_name]
%dir %attr(755,[application_name],[application_name]) /usr/local/[application_name]/lib
/usr/local/[application_name]/lib/*
%attr(755,[application_name],[application_name]) /usr/local/[application_name]/logs
%dir %attr(755,[application_name],[application_name]) /usr/local/[application_name]/conf
%config /usr/local/[application_name]/conf/[application_name]-config.xml
%config /usr/local/[application_name]/conf/log4j.properties
%dir %attr(755,[application_name],[application_name]) /usr/local/[application_name]/deploy
/usr/local/[application_name]/deploy/*
%doc /usr/local/[application_name]/README.txt
%dir %attr(755,[application_name],[application_name]) /usr/local/[application_name]/bin
%attr(755,[application_name],[application_name]) /usr/local/[application_name]/bin/*
%attr(755,root,root) /etc/init.d/[application_name]
%dir %attr(755,[application_name],[application_name]) /var/run/[application_name]

%changelog
* Wed Nov 13 2013 Jakub Stransky <Jakub.Stransky@jaksky.com> 1.0.2-1
- Bug Fixing wrong messages format
* Wed Nov 13 2013 Jakub Stransky <Jakub.Stransky@jaksky.com> 1.0.1-1
- Bug Fixing wrong messages format
* Mon Nov 11 2013 Jakub Stransky <Jakub.Stransky@jaksky.com> 1.0.0-1
- First release of [application_name]

Several things to highlight in the SPEC file example: _tmppath points to the location where the installed application is prepared; that is essentially what is going to be packed into the rpm package. %defattr sets the default attributes for files when specific ones are not given. %config denotes configuration files, which means that on the first installation the standard ones are provided, but in case of an upgrade those files will not be overwritten, as they have probably been customised for this particular instance.
Now we are ready to create the Linux rpm package; just the last step is pending:

rpmbuild -v -bb --clean SPECS/nameOfTheSpecFile.spec

The created package can be found in the RPMS sub-folder. We can test the package locally by:

rpm -i nameOfTheRpmPackage.rpm
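Once it is installed, a couple of quick queries confirm what actually landed on the machine (the package name is a placeholder):

rpm -qi nameOfTheApplication   # package metadata including install date
rpm -ql nameOfTheApplication   # list the installed files
rpm -qc nameOfTheApplication   # list the files marked as %config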

To complete the smoke test, let's remove the package by:

rpm -e nameOfTheApplication

Creating a SPEC file should be a pretty straightforward process, and once you have created the SPEC file for your application, building the Linux rpm package is a one-minute job. If you want to automate it, there is a Maven plugin which generates a SPEC file for you. It is essentially a wrapper around the rpmbuild utility, which means the plugin works fine on Linux with the tool set installed, but on a Windows machine you need to have Cygwin installed and create a wrapper bat file mimicking the rpmbuild utility for the plugin. A detailed manual can be found, for example, here.

A couple of things to highlight when creating a SPEC file: prepare the Linux package for all scenarios (install, remove, upgrade and configuration management) right from the beginning, and test it properly. It can save you a lot of trouble and manual work in case of large installations. Creating a new version of the Java application is then only about replacing the jar file and re-packaging the rpm bundle.

In this quick walk-through I tried to show that creating a Linux rpm package as a unit of software deployment for a Java application is not that difficult and can neaten up the roll-out process. I have only scratched the surface of Linux RPM packaging and am far from showing all the capabilities of this approach. I will conclude this post with several links which I found really useful.

Great tutorial on RPM packaging in general
Good rpmbuild manual pages
Maven rpm plugin
Maximum RPM book