Hey there! Welcome to Platform Weekly, your weekly bowl of platform engineering popcorn. This week we’ve got another guest newsletter from Lou Bichard from Gitpod, breaking down their recent viral article. Will I get as many angry replies after this one? Let’s dive in and see.

We left Kubernetes

Yep, you read that right. To us, Kubernetes seemed like the obvious choice when we set out to build remotely accessible, standardized, and automated development environments.

But after six years of building a cloud development environment platform that serves thousands of development environments per day, we found that Kubernetes is not the right choice for building development environments.

Last week, we published a detailed post covering our six-year journey.

This comment is a good tl;dr that accurately summarizes the whole post:

"The problem with 'development environments', like other interactive workloads, is that there is a human at the other end that desires a good interactive experience with every keypress. It's a radically different problem space than what k8s was designed for."

Here are some of the main highlights from the article:

Why development environments are different

  • They're highly stateful with source code, build caches, and test data because developers don’t take kindly to losing source code or work progress.
  • Their resource usage is unpredictable and bursty - a development environment can idle one moment and need several CPU cores within milliseconds the next.
  • They demand root access and extensive permissions to help with package installation and general system configuration.

Some of the technical challenges we faced:

  • Unpredictable resource usage - Development environments don't need much CPU bandwidth most of the time, but can require several cores within a few hundred milliseconds. We solved this using custom control loops, based on CFS and process priorities, built on cgroup v1.
  • Startup performance is critical for the user experience - We addressed some startup time issues using SSD RAID 0, which gave us high IOPS and bandwidth. Our experiments with block storage and PVCs ultimately failed.
  • Giving users root access essentially provides them with root privileges on the node itself - We solved this by using "user namespaces", which provide fine-grained control over the mapping of user and group IDs inside containers.
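
To make the first bullet a bit more concrete, here is a minimal sketch of what one step of a CFS-based control loop could look like. It is not our production code. The names `Limits` and `next_quota_us`, the core counts, and the 80% threshold are all hypothetical and chosen only for illustration. The sketch assumes the default 100 ms CFS period and that the result would be written to the cgroup v1 file `cpu.cfs_quota_us`.

```python
from dataclasses import dataclass

# CFS hands out CPU time per period; 100 ms is the kernel default
# for cpu.cfs_period_us.
CFS_PERIOD_US = 100_000


@dataclass
class Limits:
    """Illustrative tuning knobs (hypothetical values, not from the article)."""
    idle_cores: float = 1.0        # CPU allowance while the environment is quiet
    burst_cores: float = 4.0       # CPU allowance while the developer is busy
    burst_threshold: float = 0.8   # fraction of the current quota that triggers a burst


def next_quota_us(used_us: int, current_quota_us: int, limits: Limits) -> int:
    """Pick the CFS quota (microseconds of CPU per period) for the next period.

    used_us is the CPU time the environment consumed during the last period.
    If it used most of what it was allowed, assume a burst is under way and
    grant the larger quota; otherwise fall back to the idle quota.
    """
    idle_quota = int(limits.idle_cores * CFS_PERIOD_US)
    burst_quota = int(limits.burst_cores * CFS_PERIOD_US)
    if used_us >= limits.burst_threshold * current_quota_us:
        return burst_quota
    return idle_quota


if __name__ == "__main__":
    limits = Limits()
    # An idle environment suddenly uses 90 ms of its 100 ms allowance.
    print(next_quota_us(90_000, 100_000, limits))
    # Later it barely touches its burst allowance and is scaled back down.
    print(next_quota_us(10_000, 400_000, limits))
```

A real control loop would run a decision like this every period for every environment on the node, and would also have to share the burst capacity fairly between neighbours, which this sketch leaves out.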

The response to the article was immense. Understandably, people have a lot of questions and doubts about leaving Kubernetes, or just want to hear from us how it’s actually working out. So if you want to go deeper, our CTO Chris, the original blog post author, put together a technical deep-dive session to cover what can't easily be broken down in an article.