Published on: August 17, 2022
Until recently, we used OpenSSH Server to handle SSH connections and provide SSH-related features, but we ultimately decided to implement our own SSHD solution. Learn more!
The story of why we moved to our own SSHD is an interesting one. GitLab provides a number of features that execute via an SSH connection. The most popular one is Git-over-SSH, which enables communicating with a Git server via SSH. Historically, we implemented the features through a combination of OpenSSH Server and a separate component, a binary called GitLab Shell. GitLab Shell processes every connection established by OpenSSH Server to communicate data back and forth between the SSH client and the Git server. The solution was battle-tested, and relied on a trusted component such as OpenSSH. Here's why we decided to implement our own SSHD.
Everyone can contribute at GitLab! A community contribution from @lorenz, gitlab-sshd, was suggested as a lightweight alternative to our existing setup. A self-contained binary with minimal external dependencies would be beneficial for containerized deployments. A GitLab-supported replacement also opened up new opportunities:
git clone operations. With a dedicated server, connections become manageable and can be shut down gracefully: the server listens for an interrupting signal and, when the signal is received, stops accepting new connections and waits for a grace period before shutting down completely. This grace period gives ongoing connections an opportunity to finish.
With gitlab-sshd, it became possible to introduce a Go profiler to surface performance problems, which was a significant improvement from an operational perspective.
With gitlab-sshd, any unpredictable call to an OpenSSH feature that we don't support is no longer possible, dramatically reducing the attack surface.
However, changing a critical component that is broadly used, and is responsible for security, carries tremendous risks. We experienced both challenges and risks:
A component with a scope this broad could have a wide range of problems. We encountered, and resolved, the following problems:
The gitlab-sshd deployment consumed huge amounts of memory. It interacted negatively with another feature under development at the same time. We must always keep the interaction with other components in mind when introducing a general component.
The golang.org/x/crypto library: This library establishes SSH connections, and has limited support for algorithms and features available in OpenSSH. We created our own fork to provide the missing features:
OpenSSH supports the server-sig-algs extension, but golang.org/x/crypto didn't support it. We started supporting this extension.
Among the missing algorithms, hmac-sha2-512 and hmac-sha2-256 are the most noticeable. We started supporting these algorithms.
Some clients, such as gpg-agent v2.2.4 and OpenSSH v7.6 shipped in Ubuntu 18.04, might send ssh-rsa-512 as the public key algorithm but actually include a rsa-sha signature. We had to relax the RSA signature check to resolve this issue.
OpenSSH options such as LoginGraceTime and ClientAliveInterval were unavailable, so we implemented multiple alternatives to preserve the features we needed.
Unfortunately, issues became visible in the production environment because of both the load and the variety of possible OpenSSH configurations. Even though we caught some bugs in our staging environments, predicting all types of problems was almost impossible. However, these actions helped us resolve the issues: