ContainIQ (YC S21) – Kubernetes Native Monitoring with eBPF

Hi HN, I’m Nate. My co-founder Matt and I are the founders of ContainIQ (https://www.containiq.com/). ContainIQ is a complete K8s monitoring solution that is easy to set up and maintain and provides a comprehensive view of cluster health.

Over the last few years, we noticed that more of our friends and other founders were adopting Kubernetes earlier on. (Whether or not they actually need it so early is not as clear, but that’s a point for another discussion.) From our past experience with open-source tooling and other platforms on the market, we knew that the existing tooling wasn’t built for this generation of companies building with Kubernetes.

Many early-stage to middle-market tech companies don’t have the resources to manage and maintain a bunch of disparate monitoring tools, and most engineering teams don’t know how to use them. But when scaling, engineering teams do know that they need to monitor cluster health and core metrics, or else end users will suffer. Measuring HTTP response latency by URL path, in particular, is important for many companies, but installing application packages for each individual microservice can be time-consuming.

We decided to build a solution that was easy to set up and maintain. Our goal was to get users 95% of the way there almost instantly.

Today, our Kubernetes monitoring platform has four core features: (1) metrics: CPU and memory for pods/nodes, view limits, capacity, and correlate to events, alert on changes; (2) events: K8s events dashboard, correlate to logs, alerts; (3) latency: monitor RPS, p95, and p99 latencies by microservices, including by URL path, alerts; and (4) logs: container level log storage and search.

Our latency feature set was built using a technology called eBPF. BPF, or the Berkeley Packet Filter, was developed from a need to filter network packets in order to minimize unnecessary packet copies from kernel space to user space. Since version 3.18, the Linux kernel has provided extended BPF, or eBPF, which uses 64-bit registers and increases the number of registers from two to ten. We install the necessary kernel headers for users automatically.

With eBPF, we are monitoring from the kernel and OS level, and not at the application level. Our users can measure and monitor HTTP response latency across all of their microservices and URL paths, as long as their kernel version is supported. We are able to deliver this experience immediately by parsing network packets directly from the socket. We then correlate the socket and sk_buff information to your Kubernetes pods to provide metrics like requests per second, p95, and p99 latency at the path and microservice level, without you having to instrument each microservice at the application level. For example, with ContainIQ you can track how long your Node.js application is taking to respond to HTTP requests from your users, which lets you see which parts of your web application are slowest and alerts you when users are experiencing slowdowns.
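To make the aggregation step concrete, here is a minimal Python sketch of how per-path RPS, p95, and p99 could be computed once request records have been collected. This is not ContainIQ's implementation: the record layout, the `percentile` and `summarize` helpers, the sample data, and the choice of the nearest-rank percentile method are all assumptions made for illustration.

```python
from collections import defaultdict

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; pct is an integer 1-100."""
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding.
    rank = (pct * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]

def summarize(records, window_seconds):
    """Group hypothetical (service, path, latency_ms) records by service and
    path, then compute requests per second, p95, and p99 for each group."""
    by_key = defaultdict(list)
    for service, path, latency_ms in records:
        by_key[(service, path)].append(latency_ms)

    summary = {}
    for key, latencies in by_key.items():
        latencies.sort()
        summary[key] = {
            "rps": len(latencies) / window_seconds,
            "p95": percentile(latencies, 95),
            "p99": percentile(latencies, 99),
        }
    return summary

# Made-up sample: 100 requests to /cart and 2 to /pay in a 10-second window.
records = [("checkout", "/cart", float(ms)) for ms in range(1, 101)]
records += [("checkout", "/pay", 20.0), ("checkout", "/pay", 40.0)]
stats = summarize(records, window_seconds=10)
```

In this sketch, `stats[("checkout", "/cart")]` holds the three metrics for one service and path. A production system would more likely use streaming histograms than sorted lists, since sorting every window does not scale to high request volumes.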

Users can correlate events to logs and metrics in one view. We knew how annoying it was to toggle between multiple tabs and then scroll endlessly through logs trying to match up timestamps. We fixed this. For example, a user can click from an event (e.g., a pod dying) to the logs at that point in time.
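The timestamp matching described above can be sketched roughly as follows. This is only an illustration of the idea, not how ContainIQ does it: the `logs_around` helper, the window sizes, and the sample log lines are hypothetical, and it assumes logs are already sorted by Unix timestamp.

```python
from bisect import bisect_left, bisect_right

def logs_around(logs, event_ts, before=30, after=5):
    """Return log lines whose timestamp falls within
    [event_ts - before, event_ts + after] seconds.

    `logs` is a list of (unix_ts, line) tuples sorted by timestamp.
    """
    timestamps = [ts for ts, _ in logs]
    lo = bisect_left(timestamps, event_ts - before)   # first log inside the window
    hi = bisect_right(timestamps, event_ts + after)   # one past the last log inside
    return [line for _, line in logs[lo:hi]]

# Made-up logs for a pod that dies at t=180.
logs = [
    (100, "starting worker"),
    (160, "memory at 95%"),
    (175, "OOM warning"),
    (300, "worker restarted"),
]
window = logs_around(logs, event_ts=180)
```

Binary search keeps the lookup cheap even for long log streams, which is the part that is tedious to do by hand when scrolling and comparing timestamps.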

Users can set alerts on nearly all data points (e.g., p95 latency, a K8s job failing, a pod eviction).

Installation is straightforward, using either Helm or our YAML files.

Pricing is $20 per node per month, plus $1 per GB of log data ingested. You can sign up on our website directly with the self-service flow. You can also book a demo if you would like to talk to us, but that isn’t required. Here are some videos (https://www.containiq.com/kubernetes-monitoring) if you are curious to see our UX before signing up.
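As a quick worked example of the pricing formula, here is a small Python sketch. The `monthly_cost` helper and the cluster size are hypothetical; only the two rates come from the pricing stated above.

```python
def monthly_cost(nodes, log_gb, per_node=20.0, per_gb=1.0):
    """Estimated monthly bill in USD: a per-node fee plus a per-GB log ingestion fee."""
    return nodes * per_node + log_gb * per_gb

# Hypothetical cluster: 10 nodes ingesting 50 GB of logs per month.
cost = monthly_cost(nodes=10, log_gb=50)  # 10 * 20 + 50 * 1 = 250.0
```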

We know that we have a lot of work left to do, and we welcome your suggestions, comments, and feedback. Thank you!


