Very few things frustrate me more than administrative roadblocks that slow me down or make it harder to do my work. I want to get from staging to production with as little interference as possible. The question every engineering team faces is: how do you allow that without compromising security?
That’s the challenge of network segmentation. The goal is a strategy that creates enough separation between systems to prevent an error in staging from accidentally polluting production data, or, in the more extreme case, to stop an attacker from compromising your entire network through one bad configuration or a single compromised host. As with any security control, the tricky part is implementing the necessary security measures without creating so much friction that engineers and end users end up frustrated.
Today, datacenters are often designed as a stretched LAN extended into cloud providers. That makes routing over VPN in a WAN architecture more complex and forces you to rethink your staging and production environments, which are now accessed over long-distance links, without introducing frustrating latency.
All cloud providers allow you to create your own network bubble in their system. The problem is connecting those networks to your on-premises datacenter without compromising segmentation or logical links.
The usual approach to connecting cloud and on-prem is a VPN. The downside of a VPN is that it is all or nothing: you either have access or you do not. If all your network traffic flows through the same tunnel, then there is a path between your production and staging environments, which compromises segmentation and creates risk.
Creating a separate tunnel per environment is the obvious solution, and teams often add redundant tunnels to ensure failover.
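One way to realize one-tunnel-per-environment is with a separate VPN interface, key pair, and subnet per environment. A minimal sketch using WireGuard-style config files (the interface names, endpoints, and subnets below are placeholders, not from the original setup):

```ini
# /etc/wireguard/wg-staging.conf — tunnel dedicated to the staging network
[Interface]
PrivateKey = <staging-private-key>
Address = 10.10.0.2/32

[Peer]
PublicKey = <staging-gateway-public-key>
Endpoint = staging-vpn.example.com:51820
AllowedIPs = 10.10.0.0/16   ; routes staging subnet only

# /etc/wireguard/wg-production.conf — separate keys, separate subnet
[Interface]
PrivateKey = <production-private-key>
Address = 10.20.0.2/32

[Peer]
PublicKey = <production-gateway-public-key>
Endpoint = prod-vpn.example.com:51820
AllowedIPs = 10.20.0.0/16   ; routes production subnet only
```

Because each tunnel’s `AllowedIPs` covers only its own environment’s subnet, traffic can never cross from one environment to the other through the client.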
In this scenario, the network architecture schema would look like this:
(User-to-server arrows represent administrative access only: SSH, RDP, and database protocols.)
This setup routes users to staging or production and lets machines talk to each other with minimal overhead. The VPN endpoints in the corporate datacenter may live on the same appliance; the important thing is to create logical separation for each tunnel.
Access filtering is done centrally, which may be more or less difficult depending on how fine-grained your security policy needs to be and on the authentication and authorization configuration of the target systems.
The drawback of this setup is that the firewall rule list doesn’t scale cleanly and can quickly start to impact performance. In all likelihood, your access controls will end up more permissive than you want just to keep the rule list at a manageable size.
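To see why the rule list doesn’t scale, consider that point-to-point allow rules grow with the product of environments, teams, and protocols. A back-of-the-envelope sketch (the team and protocol names here are illustrative assumptions):

```python
# Each (team, environment, protocol) combination that must be allowed
# becomes at least one firewall rule — before per-host exceptions.
environments = ["staging", "production"]
teams = ["backend", "data-eng", "sre", "on-call"]
protocols = ["ssh", "rdp", "postgres", "kubectl"]

rules = [(team, env, proto)
         for env in environments
         for team in teams
         for proto in protocols]

print(len(rules))  # 2 envs x 4 teams x 4 protocols = 32 allow rules
```

Add a fifth team or a new protocol and the list grows multiplicatively, which is why teams end up writing broader, more permissive rules instead.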
I’ve used strongDM to enforce network segmentation in a way that doesn’t create roadblocks for engineers who require access across systems.
So how does strongDM work? A strongDM deployment architecture could be represented like this:
Think of strongDM as a software-defined network that routes database queries and SSH, RDP, and kubectl commands through one or more gateways to any database, server, or Kubernetes cluster.
The gateway acts as an entry point to your network. Redundant gateways can be deployed for high availability. When multiple gateways serve the same infrastructure, traffic is automatically routed along the lowest-latency path.
You can drive traffic through your existing VPN, but the gateway is hardened and can be exposed directly to the internet. In this case, I allowed direct internet access.
Connections to the strongDM API are egress-only from your network and are used solely to authenticate sessions; the traffic itself is routed exclusively within your network.
Once installed, the client can connect to an instance (over SSH, RDP, or a database protocol) through the gateway.
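In practice, the client workflow looks roughly like this (the commands and resource name below are illustrative; consult the strongDM CLI documentation for the exact syntax in your deployment):

```
sdm login                  # authenticate through your identity provider
sdm status                 # list the resources this user is entitled to
sdm connect prod-postgres  # bind the datasource to a local port
psql -h 127.0.0.1          # the client relays the session via the gateway
```

The key point is that the local tool (psql, ssh, kubectl) talks to a local port, and the strongDM client handles authentication and routing through the gateway transparently.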
The main advantages are:
- Least privilege by default. strongDM allows us to extend RBAC to grant the right permissions to databases, servers, and Kubernetes clusters. There’s no need to share credentials across teams, and we can even grant temporary access for on-call teams. We can accomplish all this without complicating firewall rules.
- Narrow your attack surface. From a security perspective, once someone’s on your VPN, they have full access to the network environment, which opens up several attack paths. If the computer they use to access a database or server has been compromised by malware, an attacker could use it to reach the broader internal network. A solution like strongDM isolates access to the specific database or server permissions assigned to that user.
- Less headache to revoke access. If we need to revoke access for any reason, strongDM lets us cut off a user through our single sign-on, so we don’t have to go into each database and server to find which credentials and keys the user has. We use strongDM as a choke point.
- Forensic audit trail means faster incident investigation. Most VPN, SSO, or bastion host logs will tell you that a session occurred, but not what sensitive data was read or accessed during that session. With strongDM we get full auditability into everything a person does: when they connect, what commands they type, and what data they retrieve.
So, how can you segment your network to secure access to sensitive systems without creating such a headache for engineers that they bombard you over Slack (or worse, poke holes through your network security)?
In the past, the answer would have been to manage multiple VPNs. But that introduces several problems:
- doesn’t scale (you end up becoming more permissive to keep the rules list to a manageable size)
- performance impact
- not enough of an audit trail (can’t tell who issued each query or command, only that a session occurred)
A software-defined network like strongDM is easy to adopt and maintain, and makes it easy to:
- enforce least privilege by default (users inherit role based access controls)
- narrow threat surface (access restricted to specific database/server permissions, not the broader network)
- forensic audit trail (strongDM logs every query, SSH, RDP, & kubectl command)