Last year we were migrating our on‑premises infrastructure to Google Cloud.
Our migration model was essentially a Lift & Shift: take the same applications that were running on our physical servers, already packaged in Docker containers, and run them with almost no changes inside virtual machines (VMs) in GCP.
On paper, it was a simple migration: same images, same application configuration, same docker-compose, only the place where the VMs ran was different.
In practice, everything worked fine… until we started testing external connectivity from inside the containers.
Symptoms
The first signs that something was off were:
- Connectivity worked from the VM, but not from Docker. Running `curl` or `ping` from the VM worked fine, but the same command from a container failed or hung.
- HTTP worked, HTTPS didn't (or was painfully slow). `http` requests responded quickly, but `https` requests suffered from timeouts, failed handshakes, or extreme latency.
- Intermittent TLS handshake errors. In the logs we saw errors like `tls handshake timeout`, or connections simply closing without much explanation.
From the outside, everything pointed to “a GCP networking issue” or “something broken in Docker”. But the root cause was more specific: the MTU.
Understanding MTU
The MTU (Maximum Transmission Unit) is the maximum size (in bytes) of a packet that can travel over a network interface without being fragmented.
In many traditional networks it’s 1500, but cloud providers (including GCP) often use slightly smaller values to account for internal encapsulations.
In our case:
- Default Docker bridge: MTU = 1500
- Google Cloud VPC: MTU = 1460
That 40‑byte difference looks small, but it’s enough to cause:
- Docker to try to send 1500‑byte packets.
- The GCP network to only accept up to 1460 bytes without fragmentation.
- With Path MTU Discovery and/or ICMP blocked somewhere, larger packets (like those used during a TLS handshake) to get lost or stuck.
Result: HTTP (smaller packets) works, HTTPS (larger packets) breaks.
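A quick way to confirm this from the VM is a "do not fragment" ping sweep. This is a sketch: `8.8.8.8` is just an example of a reachable host, and the payload sizes assume the usual 28 bytes of ICMP + IPv4 headers on top of the payload.

```shell
# Probe the path MTU with fragmentation forbidden (-M do).
# Payload size + 28 bytes of ICMP/IP headers = total packet size.
ping -c 3 -M do -s 1432 8.8.8.8   # 1432 + 28 = 1460: fits the GCP VPC MTU
ping -c 3 -M do -s 1472 8.8.8.8   # 1472 + 28 = 1500: too big for a 1460 link
```

If the first command succeeds while the second fails with "Message too long" or simply gets no replies, you are looking at an MTU problem, not a DNS or firewall one.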
How we diagnosed it
The key step was comparing the network configuration inside the VM versus inside a Docker container.
```shell
# Inside the VM in GCP
ip link show eth0 | grep mtu
# Output (example)
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc fq_codel state UP ...

# Inside a Docker container
docker exec -it app bash
ip link show eth0 | grep mtu
# Output (example)
# 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
```

That's where we clearly saw the issue: Docker was using MTU 1500 on top of a network whose real MTU was 1460.
Another useful command was inspecting the Docker bridge network:
```shell
docker network inspect bridge | jq '.[0].Options'
```

If the MTU is not explicitly configured, Docker falls back to its default (1500).
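You can also read a container's MTU without keeping a shell open in it, by printing it from sysfs in a throwaway container. A minimal sketch, assuming the `busybox` image is available:

```shell
# Start a minimal container on the default bridge and print its eth0 MTU.
docker run --rm busybox cat /sys/class/net/eth0/mtu
```

On an affected host this prints 1500 even though the VM's own interface reports 1460.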
Fix: align Docker’s MTU with GCP
The fix was to create Docker networks with MTU 1460, aligned with the GCP VPC.
There are several ways to do this, depending on how you manage Docker on your VMs.
1. Set a global MTU in daemon.json
On hosts where you control dockerd, you can set the default MTU in /etc/docker/daemon.json:
```json
{
  "mtu": 1460
}
```

Then restart the service:

```shell
sudo systemctl restart docker
```

All bridge networks created after that will respect this MTU.
2. Create a Docker network with a specific MTU
If you prefer not to touch the global configuration, you can create a dedicated network for your services:
```shell
docker network create \
  --driver bridge \
  --opt com.docker.network.driver.mtu=1460 \
  app-network
```

Then attach your containers to `app-network` instead of the default bridge network.
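To confirm the option actually landed on the new network, you can read it back with a Go-template filter instead of scanning the full JSON. A sketch, using the `app-network` name from above:

```shell
# Print only the MTU option of the user-defined network.
docker network inspect app-network \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
# Should print 1460 if the network was created with the option above.
```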
3. Docker Compose example
If you manage your services with Docker Compose, you can define the MTU directly in the networks section:
```yaml
version: "3.9"

services:
  web:
    image: my-app:latest
    ports:
      - "80:80"
      - "443:443"
    networks:
      - app-network

networks:
  app-network:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1460
```

With this configuration:
- All containers attached to `app-network` use MTU 1460.
- Outbound traffic no longer suffers from oversized packets.
- `https` requests stop hanging and TLS handshakes become stable again.
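Before restarting anything, it is worth rendering the resolved configuration to make sure the option key is spelled correctly, since `driver_opts` keys are passed through to Docker as opaque strings. A sketch, assuming Compose v2 (`docker compose`; with the older standalone binary it is `docker-compose config`):

```shell
# Render the fully resolved Compose file; the MTU option should appear
# under networks -> app-network -> driver_opts.
docker compose config | grep -A 2 driver_opts
```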
Verification
After applying the change, we repeated our tests:
- `curl https://google.com` from inside the VM
- `curl https://google.com` from inside the container
- Production requests to external APIs
- No more handshake‑related errors in the logs
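To turn "TLS is slow" into an actual number, curl's write-out variables are useful: `time_appconnect` is the time until the TLS handshake completed. A sketch, with the same example URL as above:

```shell
# Measure TLS handshake time vs. total request time from inside a container.
curl -sS -o /dev/null \
  -w 'appconnect: %{time_appconnect}s  total: %{time_total}s\n' \
  https://google.com
```

With the MTU mismatch in place you would expect `time_appconnect` to balloon or the command to time out entirely; after aligning the MTU it settles back to normal values.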
The only real change was aligning Docker’s MTU with the GCP VPC MTU.
Lessons learned
- Don’t assume MTU is always 1500, especially in the cloud.
- When something works from the VM but not from Docker, look at MTU, routes, and NAT, not just DNS or firewalls.
- Documenting these infrastructure “gotchas” saves hours (or days) in future migrations.
In our case, a detail as small as the size of a network packet was the difference between a migration that was “technically done but flaky” and a platform that was actually reliable on Google Cloud.